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Chapter  1 


Introduction 


This  thesis  presents  and  analyzes  algorithms  that  solve  a  variety  of  long  standing 
problems  in  non-intrusive  power  system  monitoring.  Techniques  from  number  theory 
and  algebra  are  applied  to  solve  common  power  system  monitoring  problems,  such  as 
accurately  determining  the  harmonic  content  of  the  current  drawn  by  an  electrical  load 
given  only  a  coarsely  quantized  version  of  that  current  and  claissifying  an  unknown  load 
on  the  basis  of  the  current  drawn  by  that  load.  The  methods  of  this  thesis  can  be 
applied  to  a  variety  of  other  common  discrete-time  signal  processing  tasks  that  involve 
computation  of  the  Discrete  Fourier  Transform  (DFT)  of  a  signal. 

Conventional  sub-metering  of  individual  electrical  loads  to  detect  problems  and 
conduct  energy  score- keeping  has  long  been  costly  and  inconvenient.  A  nagging  problem 
for  over  two  decades  has  been  that  these  costs  increase  swiftly  as  data  requirements 
become  increasingly  complex:  “the  high  cost  of  equipment  continues  to  limit  the  amount 
of  [usage]  data  utilities  can  collect.  Additional  drawbacks  of  the  equipment  now  available 
for  collection  of  end-use  load  survey  data  range  from  their  cost,  reliability,  and  flexibility 
to  intrusion  into  the  customer’s  activities  and  premises”  [19]. 
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Computational  power  and  data  transmission  capabilities  for  commercial  monitor¬ 
ing  and  control  systems  have  out-paced  the  problem  of  putting  sensors  in  all  the  right 
places.  Various  kinds  of  high-speed  data  networks  provide  convenient  remote  access  to 
control  inputs  and  system  operating  information  for  embedded  control  and  monitoring 
systems.  Similarly,  microprocessors  and  associated  technologies  for  these  systems  have 
achieved  astounding  price/performance  ratios.  Obtaining  useful  information,  however, 
generally  requires  proper  installation,  maintenance,  and  interpretation  of  a  vast  collection 
of  sensors  a  daunting  proposition  even  if  the  sensors  are  mass  produced,  micro-miniature, 
and  individually  inexpensive. 

A  Non-Intrusive  Load  Monitor  (NILM)  can  determine  the  electrical  operating 
schedule  of  a  collection  of  loads  from  a  single  measurement  of  aggregate  current  flowing 
to  the  loads.  The  NILM  addresses  the  “sensor  problem”  for  electric  load  monitoring  by 
extracting  information  about  individual  loads  from  limited  measurements  at  an  easy-to- 
access,  centralized  location  [1].  For  example,  the  NILM  can  disaggregate  and  report  the 
operation  of  individual  electrical  loads  like  lights  and  motors  from  measurements  made 
only  at  the  electric  meter  where  service  is  provided  to  a  building.  The  NILM  is  capable 
of  performing  this  disaggregation  even  when  many  loads  are  operating  at  the  same  time. 
Because  the  NILM  associates  observed  electrical  waveforms  with  individual  kinds  of  loads, 
it  is  possible  to  exploit  modern  state  and  parameter  estimation  algorithms  to  remotely 
verify  and  determine  the  condition  or  health  of  critical  loads  ([20]  describes  techniques 
suitable  for  motor  parameter  estimation  from  a  non-intrusive  monitor,  for  example.).  The 
NILM  has  the  potential  to  be  a  turn-key,  enabling  platform  for  future  energy  conservation 
and  monitoring  in  a  smart  grid  that  services  both  homes  and  commercial/industrial 
facilities. 

A  NILM  makes  us  of  the  “spectral  envelope”  representation  of  observed  current 
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signals.  This  scheme  considers  samples  i[n]  of  a  current  i{t),  where  a  set  of  samples 
are  taken  for  each  period  of  the  line  voltage  waveform.  The  DFT  of  the  set  of  samples 
corresponding  to  each  period  is  then  computed.  This  produces  a  set  of  DFT  coefficients 
for  each  period.  A  spectral  envelope  is  the  time  evolution  of  a  single  DFT  coefficient.  This 
can  be  a  very  flexible  basis  for  computing  and  tracking  all  sorts  of  nseful  metrics  about 
power  consumption.  Spectral  envelopes  estimate  real  and  reactive  power  consnmption 
and  harmonic  content.  The  algorithms  presented  in  this  thesis  can  be  applied  to  a  variety 
of  useful  spectral  envelope  calculations. 

When  working  with  a  continuons-time  signal  i{t),  it  is  often  desirable  to  examine 
its  discrete-time  samples  i[n].  In  any  practical  application,  it  is  impossible  to  obtain  z[n] 
to  infinite  resolution.  Instead,  only  the  quantized  valnes  i[n\  are  available,  where  i[n]  is 
simply  i[n]  quantized  to  some  finite  number  of  bits  of  resolution.  While  the  DFT  of  this 
quantized  signal  can  be  redily  computed,  this  is  not  a  perfectly  accurate  statement  of  the 
true  frequency  content  of  the  unquantized  signal  z[n].  Unfortunately,  since  quantization  is 
a  many-to-one  operation,  it  is,  in  general,  impossible  to  exactly  reconstruct  i[n]  from  i[n], 
and  thus  it  is  also  impossible  to  exactly  determine  the  DFT  of  i[n]  from  2[?7.],  because  the 
DFT  is  a  bijection.  However,  with  additional  information  about  the  structure  of  i[n],  it 
is  possible  to  obtain  a  significantly  more  accurate  estimate  of  the  true  frequency  content 
of  i[n].  Additional  constraints  about  z[n]  restrict  the  class  of  possible  i[n[  that  could  have 
produced  the  observed  i[n].  Chapter  2  demonstrates  an  efficient  algorithm  that  exploits 
the  structure  of  the  mapping  between  regions  of  frequency  space  and  qnantized  values 
to  improve  the  estimation  of  the  frequency  content  of  i[n]  by  applying  these  additional 
constraints. 

Chapter  3  considers  the  related  question  of  using  knowledge  of  the  values  of  some 
particular  subset  of  frequency  components  to  estimate  the  values  of  other  frequency 
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components.  Of  course,  due  to  the  orthogonality  of  frequency  components,  the  value 
of  one  component  is  complete  independent  from  the  value  of  any  other  component,  so 
no  non-trivial  answer  can  be  given  to  this  question,  in  general.  However,  if  certain 
additional  constraints  about  i[n]  are  known,  then  it  will  be  possible  to  use  information 
about  one  set  of  frequency  components  to  estimate  the  value  of  another  set.  The  types 
of  constraints  that  make  this  possible  are  considered.  An  initial  algorithm  is  presented 
that  applies  those  constraints  to  perform  this  estimation.  This  algorithm  will  be  shown 
to  be  numerically  unstable,  and  a  refined  algorithm  will  be  considered  that  completely 
avoids  numerical  problems  by  using  properties  of  cyclotomic  fields. 

In  Chapter  4,  the  problem  of  identifying  an  electrical  load  from  a  subset  of  the 
frequency  content  of  its  measured  current  is  considered.  The  relationship  between  the 
structure  of  the  currents  drawn  by  different  loads  in  a  class  and  the  minimal  subset 
of  frequency  content  needed  to  unambiguously  identify  a  single  load  in  that  class  is 
examined.  An  algorithm  for  performing  this  classification  task  is  discussed. 

Chapter  5  details  the  design  and  operation  of  an  implementation  of  an  embedded 
system  that  takes  quantized  samples  i[n]  of  a  current  signal  and  computes  the  corre¬ 
sponding  spectral  envelopes.  The  system  is  capable  of  delivering  this  information  in  a 
variety  of  convenient  forms,  including  via  WiFi. 
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Chapter  2 


Quantization  Effects  on  the  DFT 


2.1  Spectral  Envelopes 

The  spectral  envelopes  of  current  represent  the  harmonic  content  of  the  input 
waveform  for  each  line-locked  period  of  the  service  voltage.  Given  N  samples  f[n]  of  a 
waveform  i{t)  over  one  period,  the  samples  can  be  expressed  in  terms  of  their  spectral 
content  by 


where  the  spectral  envelope  values  pk  and  for  that  period  are  defined  as 

.r  ,  .  f  2'Kkn\ 

Pfe  =  2j^Wsm  f j  (2.2) 

n=0  ' 


and 
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Here,  k  denotes  the  multiple  of  the  line  frequency  to  which  a  particular  spectral  envelope 
corresponds;  for  example,  on  a  60Hz  utility  service,  k  =  1  corresponds  to  the  60  Hz 
component  and  A:  =  3  to  the  180  Hz  component.  The  values  of  these  spectral  envelopes 
are  calculated  for  each  period  of  the  line  voltage;  the  values  at  period  m  will  be  denoted 
Pk[m]  and  qk[m].  With  this  definition,  spectral  envelopes  can  naturally  be  calculated 
from  the  real  and  imaginary  parts  of  the  Discrete  Fourier  Transform  (DFT)  [2]  of  i[n] 
over  each  period  of  the  line  voltage. 

The  complete  collection  of  coefficients  Pk  and  Qk,  for  all  A:,  determine  the  signal 
i[n]  over  one  cycle.  The  spectral  envelope  values  can  be  understood  to  have  meaningful 
physical  interpretations.  For  example,  if  the  line  voltage  waveform  consists  of  only  a 
single  pure  sinusoid,  then  pi  corresponds  to  the  real  power  consumed,  and  qi  to  the 
reactive  power. 


2.2  Quantization 

In  any  practical  application,  it  is  not  possible  to  obtain  samples  i[n]  of  the  wave¬ 
form  i{t)  to  infinite  precision.  Instead,  only  quantized  samples  are  generally  available.  A 
quantizer  maps  points  in  a  continuous  interval  to  a  discrete  set  of  points.  The  continu¬ 
ous  interval  is  partitioned  into  a  set  of  regions,  called  quantization  intervals,  by  a  set  of 
points,  called  boundary  points  or  interval  endpoints.  Each  interval  has  a  value  associated 
with  it;  these  values  are  called  representation  points.  The  quantizer  maps  each  value  in  a 
quantization  interval  to  the  corresponding  representation  point.  To  formally  specify  the 
operation  of  a  quantizer,  let  M  G  N  and  define  two  sets  of  points  A  =  {oi, . . . ,  om}  and 
B  =  {6o, . . . ,  5m}  where  aj,  bj  G  R  Vj  and  bj  <  bj+i  Vj  G  [0,  M  —  1].  The  elements  of  the 
set  A  are  the  representation  points  and  the  elements  of  set  B  are  the  boundary  points. 
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Set  B  specifies  a  collection  of  M  regions,  i?i, . . . ,  Rm  where  jRi  =  (6o,  ^i],  R2  =  (bi,  ^2], 
-R3  =  (^2,^3])  ■■■,  Rm  =  Define  the  quantizer  function,  Q  :  (6o)^m]  B, 

such  that  Q{x)  =  bj  where  x  G  Rj.  In  the  above  definition,  we  restrict  the  domain  of 
the  quantizer  to  the  interval  (60,  ^m]-  The  operation  of  the  quantizer  is  left  undefined  for 
inputs  outside  of  all  quantization  intervals.  An  alternate  approach  would  be  to  define 
the  first  and  last  quantization  intervals  to  be  Ri  =  {—00,  bi]  and  Rm  =  {bM-i,  00],  which 
would  have  the  advantage  of  having  the  entire  real  line  as  a  domain.  However,  this  def¬ 
inition  is  not  used  here  because  unbounded  quantization  intervals  would  unnecessarily 
complicate  the  analysis  without  providing  any  benefit  because  any  practical  application 
involves  quantizing  values  in  a  finite  interval. 

Let  i[n]  denote  the  quantized  samples  of  the  waveform  i{t).  That  is  to  say,  i[n]  = 
Q{i[n\).  In  an  analogous  fashion  to  the  above,  these  quantized  samples  can  be  expressed 
in  terms  of  their  spectral  content  by 


N-l 


i\n\ 


1  f _  .  f  2'Kkn\  _  ( 2nkn\ 

=  V  A  J  ) 


fc=0 


where  the  spectral  envelopes  and  for  that  period  are  defined  by 


and 


(2,4) 


(2.5) 


(2.6) 


It  is  useful  to  consider  the  relationship  between  the  pk’s  and  g^’s  that  would  be 
desirable  to  have  and  the  Pf.’s  and  gj.’s  that  one  be  obtained  in  practice.  To  this  end. 
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first  notice  that  there  is  a  one-to-one  relationship  between  the  set  of  all  PkS  and  QkS 
and  the  N  samples  i[n\  (because  the  DFT  is  a  bijection).  Similarly,  there  is  a  one-to-one 
relationship  between  the  set  of  all  pj,’s  and  qf.’s  and  the  N  quantized  samples  i[n].  On 
the  other  hand,  there  is  a  many-to-one  relationship  between  the  N  unquantized  samples 
i[n]  and  the  N  quantized  samples  i[n]  (the  quantizer  maps  many  unquantized  values  to 
the  same  quantized  value,  because  all  points  in  a  quantization  interval  are  mapped  to 
the  same  representation  point).  Thus,  there  is  a  many-to-one  relationship  between  the 
set  of  all  PkS  and  Qk’s  and  the  set  of  all  Pk’s  and  g^’s,  which  means  it  is  impossible  to 
uniquely  reconstruct  the  p^’s  and  qkS  from  the  p^’s  and  g^’s. 

Despite  this  non-uniqueness  problem,  one  can  still  consider  how  accurately  the 
Pit’s  and  qk’s  can  be  estimated  from  the  Pk’s  and  g^’s.  One  simple  method  is  to  use  each 
Pit  as  the  estimate  for  the  corresponding  pk  and  each  q/.  as  an  estimate  for  the  g^f  Fig.  2.1 
shows  Pi  values  as  a  function  of  actual  pi  values  for  a  pure  60  Hz  in-phase  sinusoid  of 
varying  amplitudes,  with  N  =  128  and  4-bit  samples.  Values  of  pi  are  marked  with  a 
+  .  The  line  Pi  —  pi  is  included  for  reference  to  illustrate  how  close  each  actual  Pi  value 
is  to  the  corresponding  pi  value.  Clearly,  using  Pi  as  an  estimate  for  pi  is  reasonably 
accurate,  though  there  is  noticable  error. 

It  is  possible  to  obtain  better  estimates  for  the  pk’s  and  qk’s.  As  noted  above, 
there  is  a  one-to-one  relationship  between  the  set  of  all  pk’s  and  qk’s  and  the  N  samples 
in  time  i[n\.  For  notational  convenience,  let  the  space  of  all  possible  values  of  the  pk’s 
and  qk’s  be  referred  to  as  FQ-space.  Clearly,  FQ-space  is  isomorphic  to  because  each 
frequency  component  can  have  an  arbitrary  real  value.  Also,  as  noted  above,  many  sets 
of  N  samples  i[n]  map  to  the  same  set  of  N  samples  i[n];  thus,  many  points  in  FQ-space 
map  to  the  same  set  of  N  samples  i[n\.  These  points  form  a  region  in  FQ-space.  The 
following  lemma  states  certain  useful  properties  of  these  regions. 
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Figure  2.1:  vs  pi. 


Lemma  1.  Let  A  be  the  set  of  points  in  PQ -space  that  corresponds  to  some  arbitrary  set 
of  quantized  samples  i[n].  Then  A  has  the  following  properties: 

1.  A  is  connected. 

2.  A  is  convex. 

3.  A  is  a  bounded  polytope. 

Proof.  1.  Assume,  for  purpose  of  contradiction,  that  ^  is  not  connected.  By  definition, 
this  means  that  A  can  be  expressed  as  the  union  of  two  non-empty  open  sets,  as 
shown  in  Fig.  2.2.  We  can  select  three  colinear  points,  x,  y,  z,  as  shown  in  the  figure. 
Consider  moving  along  the  line  segment  from  x  to  y  to  z.  Each  point  in  PQ-space 
has  a  corresponding  unique  set  of  unquantized  samples,  i[n].  Thus,  as  we  move 
along  this  line  segment,  the  corresponding  set  of  unquantized  samples  change.  To 
be  precise,  the  direction  specified  by  motion  along  the  line  segment  corresponds  to 
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a  particular  ratio  of  spectral  components  (the  pk’s  and  gfc’s).  Moving  in  a  given 
direction  causes  the  unquantized  samples  to  change  by  adding  spectral  content  in 
the  specified  ratio.  For  example,  if  both  x  and  y  lied  along  the  pi  axis,  then  moving 
along  the  line  segment  causes  the  pi  content  of  the  samples  to  change,  but  all  other 
spectral  content  is  left  unaffected.  Similarly,  if  x  and  y  lie  in  the  piQi  plane  along 
the  line  p\  =  2qi  then  moving  along  the  line  segment  causes  the  pi  and  qi  content 
to  change  (with  the  pi  content  changing  by  twice  as  much  as  the  qi  content),  and 
all  other  spectral  content  to  remain  the  same. 

Every  point  in  A  corresponds  to  the  same  collection  of  quantized  samples,  i[n\.  This 
means  that  as  we  move  along  the  line  segment  the  first  time  an  unquantized  sample 
will  cross  a  boundary  point  of  any  quantization  interval  is  at  the  boundary  of  the  set 
containing  x  and  y.  At  this  point,  some  sample,  say  i[k],  will  leave  the  quantization 
interval  whose  representation  point  is  i[A:]  and  move  to  either  the  quantization 
interval  immediately  above  or  the  one  immediately  below.  In  particular,  if  the 
added  spectral  content  at  sample  index  k  (in  the  ratio  specified  by  moving  along 
the  line  segment  from  x  to  y)  is  positive  (that  is  to  say,  if  we  consider  the  time 
domain  samples  corresponding  to  just  the  added  spectral  content  from  moving 
along  the  line  segment,  and  the  sample  at  index  k  is  positive)  then  this  sample 
moves  to  the  quantization  interval  immediately  above  and  if  the  added  spectral 
content  at  sample  index  k  is  negative  then  this  sample  moves  to  the  quantization 
interval  immediately  below.  The  added  spectral  content  cannot  be  0  at  sample 
index  k  because,  if  it  were,  then  motion  along  the  line  segment  would  not  cause 
i[k]  to  move  at  all,  which  contradicts  the  fact  that  i[k]  crossed  a  boundary  point 
of  a  quantization  interval.  In  any  case,  whichever  direction  i[k]  moved  along  the 
path  from  x  to  y,  it  must  move  in  the  same  direction  along  the  path  from  y  to  z 
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Figure  2.2:  A  is  connected. 


and  so  i[k]  can’t  return  to  the  original  quantization  interval  at  z.  Thus,  i[k]  would 
be  in  different  quantization  intervals  at  x  and  2:.  This  means  that  x  and  ^  must 
correspond  to  different  sets  of  quantized  samples,  which  contradicts  the  definition 
of  A.  This  contradiction  immediately  implies  that  A  is  connected. 


2.  The  fact  that  A  is  convex  follows  from  an  analogous  argument.  Assume,  for  con¬ 
tradiction,  that  A  is  not  convex.  Then  we  have  the  situation  shown  in  Fig.  2.3. 
We  can  again  select  three  colinear  points,  x,  y,  z,  as  labeled,  and  consider  traveling 
along  the  straight  line  path  specified.  We  again  have  the  problem  that  once  a  sam¬ 
ple  point  leaves  a  quantization  interval,  it  can  never  return,  and  so  x  and  2:  again 
correspond  to  different  quantized  samples.  This  contradicts  the  definition  of  A  and 
so  A  must  be  convex. 


17 


3.  To  see  that  A  is  a  poly  tope,  consider  the  boundary  between  A  and  another  neigh¬ 
boring  set  of  points  B  in  PQ-space  that  corresponds  to  a  different  set  of  quantized 
samples.  By  the  above,  both  A  and  B  are  convex,  and  so  the  boundary  must  be 
a  portion  of  a  hyperplane.  The  hyperplane  divides  PQ-space  into  two  half-spaces 
where  A  only  includes  points  from  one  of  the  half-spaces  and  B  only  includes  points 
from  the  other  half-space.  Thus,  A  is  the  intersection  of  the  half-spaces  specified  by 
all  of  the  boundary  hyperplanes.  This  intersection  forms,  by  definition,  a  polytope. 
The  polytope  is  necessarily  bounded  because  the  boundary  points  of  the  quanti¬ 
zation  region  i[n]  bound  the  range  that  i[n]  can  be  in  while  still  remaining  in  A, 
Vn.  It  is  worth  noting  that  PQ-space  is,  itself,  unbounded  (as  noted  above,  it  is 
isomorphic  to  R^),  but  the  only  polytopes  of  interest  in  PQ-space  are  those  that 
correspond  to  quantized  samples  i[n],  which  are  all  bounded. 

□ 
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One  possible  way  to  improve  the  accuracy  of  the  estimation  of  the  pk’s  and  q^’s 
from  the  p/^’s  and  is  to  use  the  fact  that  there  is  a  one-to-one  correspondence  between 
quantized  samples  i[n]  and  regions  in  PQ-space.  Moreover,  as  will  be  shown  shortly,  it 
is  possible  to  determine  the  region  from  the  quantized  samples.  Since  it  is  also  clearly 
possible  to  determine  the  quantized  samples  from  the  p^’s  and  Qi.  (using  the  DFT),  the 
Pfc’s  and  can  be  used  to  determine  the  region  in  PQ-space  that  the  true  (unquantized) 
samples  came  from.  Once  this  region  is  determined,  it  can  be  used  to  estimate  the  p^s 
and  QkS  in  several  ways.  A  particularly  simple  estimate  would  be  to  use  the  centroid 
of  the  region  of  PQ-space.  Another  technique  involves  the  use  of  additional  information 
about  the  behavior  of  real  electrical  loads.  As  observed  in  [1],  many  electrical  loads  draw 
current  profiles  that  consist  of  only  a  small  number  of  significantly  non-zero  spectral 
envelopes,  for  example  the  l®*,3''‘^,5*^^,and  7*^  (in  both  p  and  q).  If  it  is  known  that  only 
a  small  number  of  the  pk’s  and  qk’s  are  non-zero,  this  knowledge  could  be  exploited  by 
considering  only  the  intersection  of  the  region  in  PQ-space  with  the  subspace  spanned  by 
the  non-zero  Pk’s  and  q^’s  and  then  taking  the  centroid  of  the  intersection.  This  second 
estimation  technique  is  considered  here. 

To  better  understand  the  structure  of  these  regions  of  PQ-space,  consider  the 
following  concrete  example  of  a  signal  for  which  only  pi  and  ps  are  nonzero.  Let  i[n]  — 
0.51  sin(^^)-|-0.23sin(^^)  denote  A  =  16  d-bit  samples  of  a  signal.  In  practice,  of  course, 
both  the  number  of  samples  N  and  the  bit  resolution  will  generally  be  significantly  higher 
than  in  this  example;  these  values  are  chosen  to  give  a  simple  illustration  of  the  structure 
of  PQ-space.  Figure  2.4  shows  the  underlying  waveform  i(t),  the  sample  values  z[n],  the 
quantized  samples  i[n]  as  well  as  the  upper  and  lower  bounds  on  the  true  value  of  each 
i[n]  given  the  observed  i[n]  (these  are  simply  the  upper  and  lower  boundary  points  of 
the  quantization  interval  that  each  i[n]  is  mapped  to).  The  signal  i[n]  corresponds  to  a 
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Possible  True  Values  Given  Quantized  Values 


Figure  2.4:  An  example  of  quantization  boundaries  and  representation  points. 

point  in  PQ-space,  namely  the  point  (.51,. 23)  in  the  pip3-plane.  The  quantized  signal 
i[n]  corresponds  to  a  region  in  PQ-space.  This  region  is  depicted  in  Fig.  2.5.  Finally, 
Fig.  2.6  shows  a  collection  of  regions  of  PQ-space  in  the  neighborhood  of  the  above  point. 
As  can  be  seen  from  these  figures,  the  regions  in  PQ-space  are  highly  non-uniform. 

The  following  section  presents  an  algorithm  that  determines  the  region  in  PQ- 
space  corresponding  to  quantized  samples  i[n].  The  effectiveness  of  the  technique  is 
illustrated  in  Fig.  2.7.  As  was  the  case  in  the  example  shown  in  Fig.  2.1,  a  family  of  pure 
60  Hz  in-phase  sinusoids  of  varying  amplitudes,  are  sampled  with  N  =  128  4-bit  samples. 
The  estimates  produced  by  the  second  estimation  technique  discussed  above  are  marked 
with  -I-  symbols.  Again,  the  reference  line  showing  the  true  pi  value  is  shown  to  illustrate 
the  error.  Clearly,  this  method  produces  more  accurate  estimates  than  simply  using 
as  an  estimate  for  pi . 
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Figure  2.7:  Estimated  pi  vs  pi  using  the  centroid  of  the  region  of  PQ-space  corresponding 
to  observed  i[n]. 

2.3  Region  of  PQ-space  corresponding  to  quantized 
samples 

Given  a  particular  set  of  quantized  samples  f[n],  the  goal  is  to  determine  the 
corresponding  region  in  PQ-space  that  contains  all  possible  values  of  the  pk’s  and  qkS 
that  could  produce  the  observed  quantized  samples.  Call  this  region  H.  As  shown  in 
Lemma  1,  P  is  a  convex  polytope,  and  so  is  completely  determined  by  its  vertices.  Thus, 
a  potential  goal  could  be  to  determine  the  coordinates  of  these  vertices.  This  is  rather 
undesirable  due  to  the  fact  that  H  is  an  AT-polytope  and  may  have  a  large  number  of 
vertices.  Fortunately,  it  is  unnecessary  to  determine  the  vertices  of  H  because,  as  noted 
above,  we  only  require  the  intersection  of  H  with  the  (comparatively)  low  dimensional 
subspace  given  by  the  pkS  and  q^s  that  are  not  identically  zero.  Let  Z  denote  this 
subspace.  Let  W  denote  the  number  of  pkS  and  QkS  that  are  not  identically  zero  {Z  is 
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a  W  dimensional  subspace).  Let  R  denote  this  intersection.  Clearly  R  is  also  a  convex 
polytope  (because  it  is  the  intersection  of  a  convex  polytope  with  a  subspace)  but  it  is 
of  a  potentially  lower  dimension  (it  is  only  a  IL^-polytope).  Thus,  the  goal  will  be  to  find 
the  vertices  of  R. 

The  following  analysis  will  consider  only  non-empty  regions  R.  Empty  regions 
R  can  be  safely  ignored  because  they  arise  from  polytopes  H  that  do  not  intersect  Z. 
These  polytopes  are  of  no  interest  because,  by  assumption,  we  encounter  only  quantized 
samples  of  waveforms  described  by  points  in  Z. 

As  a  first  step,  notice  that  given  i[n],  it  is  easy  to  find  a  single  point  in  R.  This 
can  be  done  by  first  producing  any  set  of  unquantized  samples,  i[n],  that  quantize  to  i[n] 
(that  is  to  say,  i[n]  =  Q{i[n]),  Vn)  and  satisfy  the  constraint  that  the  point  must  lie  in 
Z.  This  is  easy  because  it  only  requires  finding  a  feasible  solution  to  a  system  of  linear 
inequalities  in  W  (not  N)  variables.  Next,  take  the  DFT  of  i[n]  to  produce  the  PkS  and 
qkS.  These  pk’s  and  qkS  specify  a  point  that  lies  in  H  because,  by  construction,  the  pk’s 
and  qk’s  correspond  to  i[n],  which  quantizes  to  i[n].  This  point  is  in  Z  and  so  the  point 
lies  in  Z  U  H  =  R. 

This  point  is  not  necessarily  the  centroid  of  R  (and  is,  of  course,  quite  unlikely 
to  be  the  centroid  of  R),  but  it  is  still  a  point  in  R.  Thus,  this  point  could  serve  as 
an  estimate  of  the  true  spectral  content  if  one  wished  to  avoid  the  extra  computational 
cost  of  determining  the  vertices  and  eventually  the  centroid  of  R.  While  this  estimate  is 
generally  not  as  good  as  using  the  centroid,  it  is,  in  general,  still  a  significant  improvement 
over  using  the  pj^'s  and  as  estimates  for  the  pk’s  and  q^s. 

Next,  an  algorithm  will  be  presented  that  determines  the  vertices  of  R  from  any 
point  in  R.  The  rough  idea  of  the  algorithm  is  to  first  find  a  single  vertex  of  R,  then 
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find  all  of  its  neighboring  vertices,  then  all  of  their  neighboring  vertices  and  so  on  until 
every  vertex  is  found.  This  algorithm  will  require  the  use  of  two  subroutines;  FIND- 
FIRST- VERTEX(y)  and  FIND-NEIGHBORS(y).  These  subroutines  will  be  described 
first. 

The  FIND-FIRST- VERTEX (y)  subroutine  finds  a  single  (arbitrary)  vertex  of  R, 
where  y  E  R.  As  noted  above,  i?  is  a  (bounded)  convex  polytope.  By  definition,  this 
means  that  R  is  the  intersection  of  a  collection  of  half-spaces.  Each  half-space  corresponds 
to  the  region  of  space  that  satisfies  the  set  of  linear  constraints  that  each  of  the  N 
points  i[n]  lie  in  the  quantization  interval  i[n\.  To  be  precise,  define  values  li, . . .  ,In  and 
ui, ...  ,Ui^  such  that  the  lower  and  upper  boundary  points  of  the  quantization  interval 
whose  representation  point  is  i[n]  are  In  and  respectively.  Then  we  have  the  following 
2N  linear  constraints: 


i[n\  >  Inj'^n 


(2.7) 


and 


i[n]  <  Un,yn.  (2-8) 

Due  to  the  fact  that  we  have  the  constraint  that  each  i[n]  must  be  less  than 
each  upper  boundary  point  but  strictly  greater  than  each  lower  boundary  point,  R  is 
neither  closed  nor  open,  because  it  contains  only  part  of  its  boundary  (the  portion  that 
corresponds  to  spectral  content  for  which  the  corresponding  i[n]  satisfies  i[j]  =  Uj,  for 
some  j).  However,  because  we  are  only  interested  in  the  vertices  of  the  polytope,  we 
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can  consider  only  the  closure  of  R  (the  smallest  closed  set  that  contains  R).  This  set 
is  defined  by  the  following  2N  linear  constraints,  which  are  identical  to  the  above  27V 
linear  constraints  with  the  exception  that  all  inequalities  have  been  made  non-strict.  In 
the  remainder  of  this  analysis,  R  will  refer  to  the  closure  of  the  original  R  defined  above. 


7[n]  >  In^Tl 


(2.9) 


and 


z[n]<u„,Vn.  (2-10) 

Interior  points  of  R  are  points  at  which  all  inequalities  are  strictly  satisfied  (no 
inequality  is  satisfied  with  equality).  The  boundary  points  of  R  are  those  points  for 
which  at  least  one  of  the  constraints  are  satisfied  with  equality.  Vertices  of  R  are  local 
maxima  for  the  number  of  constraints  satisfied  with  equality  (infinitessimal  movement 
from  a  vertex  in  any  direction  that  remains  in  R  will  decrease  the  number  of  satisfied 
constraints).  FIND-FIRST- VERTEX  locates  a  vertex  by  starting  with  any  point  in  R 
and  moving  that  point  in  such  a  way  as  to  satisfy  increasingly  many  constraints  with 
equality.  Let  the  constraints  be  numbered  1, . . . ,  27V  in  arbitrary  order,  let  Zi  denote  the 
subspace  of  R  in  which  constraint  i  is  satisfied  with  equality.  This  algorithm  is  shown  in 
pseudocode  below. 


FIND-FIRST- VERTEX(7/) 
x^y 
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S 

for  i  =  1  to  2N  do 
if  X  E  Zi  then 
s^snZi 
end  if 
end  for 
repeat 

select  arbitrary  s  E  S 

x'  X  +  ks  where  A;  >  0,  A:  is  the  smallest  value  such  that  x'  lies  in  the  boundary 
of  R  {x'  is  the  translation  of  x  that  satisfies  at  least  one  new  constraint} 
for  z  =  1  to  2N  do 

if  x'  e  Zi  and  x  ^  Zi  then 
s^snZi 
end  if 
end  for 
X  ^  x' 

until  dim{S)  =  0 

return  x 


The  algorithm  keeps  track  of  a  single  point  x,  and  a  space  S,  which  are  updated 
by  a  series  of  moves,  x  is  initialized  to  the  point  y  known  to  be  in  R  and  S  is  initialized  to 
R.  Next,  S  is  updated  to  be  the  subspace  of  R  that  is  the  intersection  of  all  Zi  for  which 
X  satisfies  constraint  i  with  equality,  by  intersecting  S  with  Z^.  The  space  S  represents 
the  space  of  directions  in  which  x  can  be  translated  such  that  every  constraint  initially 
satisfied  with  equality  is  still  satisfied  with  equality  after  the  translation.  We  then  move 
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along  some  direction  s  E  S  until  at  least  one  new  constraint  is  satisfied  with  equality,  x 
and  S  are  then  updated.  We  continue  moving  x  until  dim[S)  =  0,  which  corresponds  to 
a  point  at  which  any  infinitessimal  translation  of  x  in  any  direction  (in  B)  would  cause 
some  constraint  that  is  currently  satisfied  with  equality  to  no  longer  be  satisfied  with 
equality.  Thus,  when  dim{S)  =  0,  we  are  at  a  local  maxima  for  the  number  of  satisfied 
constraints,  and  so  a;  is  a  vertex  of  R.  Equivalently,  the  condition  dim{S)  =  0  can  be 
viewed  as  expressing  the  fact  that  the  space  that  satisfies  the  Zi  constraints  found  in 
each  iteration  is  a  0-dimensional  space  (a  point). 

The  purpose  of  FIND  —  NEIGHBORS{y)  is  to  find  all  vertices  that  are  neigh¬ 
bors  of  the  vertex  y.  By  definition,  two  vertices  x  and  y  are  neighbors  if  they  are  connected 
by  an  edge.  Thus,  we  can  find  all  neighbors  of  y  by  moving  along  each  edge  incident  to 
y  until  a  new  vertex  is  reached.  Moving  along  an  edge  is  accomplished  by  selecting  a 
constraint  that  is  satisfied  with  equality  at  y  and  relaxing  it  (allowing  it  to  be  satisfied 
with  inequality)  by  moving  along  the  edge.  The  algorithm  is  shown  in  pseudocode  below. 


FIND-NEIGHBORS(y) 

L^{] 

for  i  =  1  to  2N  do 
y  E  Zi  then 
S^SiJZi 

end  if 
end  for 

for  all  Zi  in  S  do 

if  dim{S\Zi)  ==  1  then 
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Assign  s  to  be  an  element  in  S\Zi  such  that  moving  y  in  the  direction  s  keeps 
constraint  i  satisfied. 

X  <—  y  +  ks  where  k  >  0,  k  is  the  smallest  value  such  that  x  lies  in  the  boundary 
of  R  {x  is  the  translation  of  y  that  satisfies  at  least  one  new  constraint} 

L  <—  LU  {x} 

end  if 
end  for 
return  L 


This  algorithm  makes  use  of  two  sets:  S  consists  of  the  constraints  satisfied  with 
equality  at  y,  L  is  the  set  of  neighbors  of  y  that  is  being  determined.  The  algorithm  first 
builds  S  by  checking  which  constraints  are  satisfied  with  equality.  Then,  it  finds  each 
edge  incident  on  L  by  using  the  fact  that  each  edge  is  a  one  dimensional  subspace  given 
by  S\Zi,  for  some  Zi.  It  then  finds  a  neighboring  vertex  x  by  moving  along  that  edge 
and  then  adds  x  to  L. 

Finally,  we  can  describe  the  algorithm  FIND  —  ALL  —  VERTICES{y)  which 
finds  all  vertices  of  R  given  some  y  G  R.  This  algorithm  is  shown  in  pseudocode  below. 


FIND- ALL- VERTICES  (y ) 

B  ^  {FIND  -  FIRST  -  VERTEX{y)} 
repeat 


for  all  5  e  S  do 


L  ^  L  U  {FIND  -  NEIGHBORS{b)  D  V^) 

end  for 
B^L 
until  \B\  =  0 


This  algorithm  builds  up  a  set  V  of  vertices  of  R.  It  operates  in  a  series  of  stages, 
where  in  each  stage  it  operates  on  newly  discovered  vertices,  stored  in  B.  Initially, 
B  contains  the  first  vertex  of  V ,  found  by  FIND-FIRST-VERTEX.  In  each  stage,  the 
elements  of  B  are  added  to  V .  Then,  the  set  L  is  constructed  which  consists  of  all 
neighbors  of  vertices  in  B  that  are  not  already  in  V.  Then,  B  is  set  to  L  and  the  cycle 
repeats  until  no  new  vertices  are  found.  This  process  terminates  because  there  are  a 
finite  number  of  vertices  and  each  vertex  can  only  be  newly  discovered  once.  It  finds  all 
vertices  because,  after  iteration  i,  all  vertices  that  are  at  distance  <  i  have  been  found, 
and  every  vertex  is  a  finite  number  of  steps  from  every  other  vertex  (the  graph  with  the 
vertices  and  edges  of  R  is  connected). 

A  Matlab  implementation  of  the  above  algorithm  is  included  in  Appendix  A. 


2.4  Calculations  from  regions  of  PQ-space 

The  previous  section  illustrated  how  to  find  the  vertices  of  a  polytope  R  =  H HZ , 
where  H  is  an  V-polytope  in  PQ-space  that  contains  all  points  with  the  same  quantized 
samples  i[n]  and  Z  is  slW  dimensional  subspace.  Using  the  vertices  of  R,  we  can  compute 
both  the  volume  of  R,  the  centroid  of  R,  and  the  maximum  distance  between  the  centroid 


29 


Figure  2.8;  Example  polygon. 


and  any  point  in  R.  The  centroid  is  useful  because  it  can  be  used  as  a  relatively  accurate 
prediction  of  the  true  spectral  content  of  the  unknown  z[n]  that  quantizes  to  the  known 
i[n].  The  volume  and  maximum  distance  are  useful  for  analyzing  the  accuracy  of  the 
prediction  algorithm. 

In  order  to  compute  the  volume  of  the  polytope  R,  we  can  partition  R  into  a  set 
of  simpler  polytopes,  and  take  the  sum  of  their  volumes.  Before  solving  this  problem  in 
arbitrary  dimension,  consider  the  following  example  in  2  dimensions.  Fig.  2.8  shows  a 
convex  2-polytope  (polygon)  R. 

R  can  be  partitioned  into  a  set  of  triangles  by  selecting  an  arbitrary  point  x  in 
the  interior  of  R  and  drawing  line  segments  from  x  to  each  vertex  of  R,  as  shown  in 
Fig.  2.9.  The  area  of  R  can  then  be  computed  by  computing  the  area  of  these  triangles 
and  summing  the  results. 

This  concept  can  be  generalized  to  an  arbitrary  VF-polytope  R  by  partitioning 
R  into  VF-simplexes.  A  VF- simplex  is  an  VF-dimensional  convex  poly  tope  with  W  +  1 


30 


Figure  2.9:  Example  polygon  with  interior  point. 

vertices.  It  can  be  thought  of  as  the  generalization  of  a  triangle  to  higher  dimensions. 
This  partitioning  is  accomplished  by  again  selecting  an  arbitrary  point  x  in  the  interior 
of  R  and  drawing  line  segments  to  each  vertex  of  R.  These  line  segments  are  the  edges 
of  the  VF-simplexes.  The  volume  E  of  a  IE-simplex  with  vertices  {uq,  . . .  ,vw}  is  given 
by 


E  -  —  det  t;i  -  t;o  V2-V0  ...  Vw-\-Vq  Vw-Vq  j,  (2.11) 

where  the  determinant  is  taken  on  an  IE  x  IE  whose  jth  column  consists  of  the 
elements  of  Vj  —  Vo- 

The  centroid  of  R  can  be  also  be  computed  through  a  decomposition  into  sim- 
plexes.  To  be  precise,  we  can  partition  R  into  a  set  of  simplexes,  as  above,  compute 
the  centroids  of  those  simplexes,  and  then  take  the  weighted  average  of  those  centroids 
(weighted  by  the  volume  of  the  centroid).  This  weighted  average  is  the  centroid  of  R. 
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The  centroid  (7  of  a  fT-simplex  with  vertices  {no, . . .  ,Vw}  is  given  by 


C  = 


1 

W  +  1 


w 


(2.12) 


The  maximum  distance  D  between  any  point  in  the  region  and  the  centroid  C, 
provides  a  bound  on  the  absolute  maximum  error  between  the  actual  spectral  content  of 
a  point  that  produced  quantized  samples  i[n]  and  the  estimated  spectral  content  C.  To 
determine  D,  we  can  use  the  fact  that  the  point  in  R  at  maximum  distance  from  C  will 
be  one  of  the  vertices  of  R  (because  R  is  a  bounded  polytope).  Thus,  D  is  given  by 


D=  max  \C  —  Vj\.  (2-13) 

je[o,w]' 

The  computations  discussed  above  are  only  a  sample  of  the  sort  of  computations 
possible  about  properties  of  the  region  R  using  only  the  vertices  of  that  region.  The 
determination  of  the  vertices  of  R,  using  the  algorithm  of  the  previous  section,  conve¬ 
niently  allows  the  efficient  computation  of  a  variety  of  other  useful  quantities,  such  as 
the  maximum  distance  between  any  point  in  R  and  the  centroid  and  the  expected  dis¬ 
tance  between  a  randomly  chosen  point  (according  to  some  known  distribution)  and  the 
centroid. 
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Chapter  3 


Cross  Estimation 

3.1  Introduction 

The  previous  chapter  considered  the  question  of  accurately  estimating  the  spectral 
content  (the  collection  of  p^s  and  g^’s)  of  a  signal  i[n]  given  the  spectral  content  p^.  and 
of  the  quantized  signal  i[n].  This  was  accomplished  by  using  additional  information 
about  the  structure  of  i[n],  specifically,  the  fact  that  the  true  i[n]  consists  of  non-zero 
spectral  content  at  a  (known)  limited  set  of  frequencies.  This  chapter  will  consider  the 
related  question  of  estimating  one  subset  of  p^s  and  q^s  from  knowledge  of  another 
subset.  In  general,  of  course,  nothing  at  all  can  be  said  about  the  value  of  any  particular 
Pj  or  qj  given  knowledge  of  any  other  pk’s  and  qkS  because  all  of  the  p^s  and  qk’s  are 
independent.  However,  much  as  was  the  case  in  the  previous  chapter,  there  will  often  be 
additional  information  about  the  structure  of  i[n]  that  will  allow  this  cross-estimation. 

Before  discussing  how  to  actually  perform  this  sort  of  estimation,  it  is  useful  to 
consider  a  particular  problem  that  motivates  the  desire  to  be  able  to  use  the  values  of 
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known  spectral  envelopes  to  estimate  unknown  spectral  envelopes.  In  many  settings,  the 
observed  current  signal  i[n]  will  be  an  aggregate  current  signal.  That  is  to  say, 

3 

where  each  ij[n]  is  the  current  drawn  by  a  single  electrical  load.  This  situation  arises  when 
monitoring  a  (potentially  large)  collection  of  electrical  loads  by  taking  measurements  of 
only  the  aggregate  current.  In  this  setting,  one  often  desires  to  know  the  spectral  content 
of  an  individual  ij  [n] .  If  we  let  pj^k  and  qj^k  denote  the  kth  spectral  envelope  of  ij  [n] ,  then 
by  the  linearity  of  the  DFT,  we  have 


Pk  =  =  qj,k- 

3  3 

Thus,  the  goal  here  would  be  to  use  knowledge  of  the  p^’s  and  g^’s  to  estimate  the 
individual  Pj,fc’s  and  Qj^kS.  Fortunately,  the  current  drawn  by  different  types  of  electrical 
loads  will  often  consist  of  different  sets  of  spectral  content  [1].  For  example,  consider 
the  case  when  i[n]  consists  of  the  sum  of  two  different  individual  current  signals,  ii[n] 
and  i2[n],  where  the  only  non-zero  spectral  content  of  i\[n]  is  pi^i  and  q\^\  and  the  only 
non-zero  spectral  content  of  Z2[n]  is  p2,i  and  p2,5-  Here,  the  sets  of  non-zero  spectral 
content  are  only  partially  overlapping.  Thus,  qi  =  gi,i  and  ps  =  P2,5,  and  so  knowledge  of 
the  aggregate  q\  and  ps  allows  the  corresponding  spectral  content  of  the  individual  loads 
to  be  determined.  Unfortunately,  pi  =  pi,i  +P2,i,  so  it  is  not  immediately  clear  how  to 
use  the  aggregate  value  pi  to  determine  the  individual  values  pi_i  and  p2,i.  This  chapter 
will  attempt  to  answer  this  question  by  using  the  attainable  gip  to  estimate  pi,i  and  p2,5 
to  estimate  ^2,5,  using  additional  information  about  the  structure  of  ii[n]  and  UN- 
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3.2  Usable  Constraints 


There  are  many  different  sorts  of  constraints  that  one  could  apply  to  a  signal  i[n]. 
This  chapter  will  examine  a  method  that  uses  constraints  of  the  form 


=  (3.1) 

k 

where 

~  Pk  T  i(Jk- 

Here,  for  convenience,  we  express  spectral  content  as  complex  values  Sk  rather 
than  separately  as  real  and  imaginary  components  pk  and  qk-  Each  €  '^[Cn],  where 
Cv  is  a  primitive  A^th  root  of  unity  (that  is  to  say,  Cv  =  1  but  Cn^I  for  j  <  N),  and 
I^ICn]  denotes  adjoining  Cv  to  the  integers  Z.  Thus  each  a  G  Z[Civ]  is  of  the  form 

N-l 

j=0 

where  bj  €  Z.  We  restrict  constraints  to  this  form  to  allow  an  efficient  and  accurate 
solution  method,  discussed  in  section  4  of  this  chapter,  that  exploits  the  properties  of 
cyclotomic  fields. 

While  this  family  of  constraints  certainly  doesn’t  capture  every  possible  constraint 
that  could  exist  on  a  signal  i[n],  it  is  still  a  rather  general  class  that  includes  many  useful 
constraints.  For  example,  the  constraint  Sk  =  0,  for  any  particular  k,  is  clearly  in  this 
class  (this  corresponds  to  setting  =  1  and  aj  =  0,  Vj  ^  k).  Similarly,  the  constraint 
i[j]  =  0,  for  any  particular  point  sample  j  is  in  the  class  because  the  Fourier  synthesis 
equation  expresses  i[j]  as  a  linear  combination  of  the  s/;,  where  the  coefficients  are  powers 
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i[n) 


Sample  number 


Figure  3.1:  The  current  drawn  by  a  Variable  Speed  Drive  (VSD)  over  one  line-cycle. 

of  some  Cv-  Similarly,  any  constraint  of  the  form  J2jeJ  subset  of  sample 

indices  J  and  any  Cj  G  Z[Civ]  are  also  in  this  class.  This  last  family  of  constraints  includes 
“symmetry”  constraints,  such  as  the  statement  i[j]  =  z[/]  or  i[j]  =  for  any  sample 

indices  j,  1. 

It  should  be  noted  that  we  could  have  instead  reasonably  defined  each  to  be  an 
element  of  Q[Cw]  rather  than  Z[^yv]-  However,  restricting  ourselves  to  is  without 

loss  of  generality  because  any  constraint  where  bj  G  Q  could  trivially  be  transformed  into 
a  constraint  with  only  integer  coefficients  by  multiplying  by  a  common  denominator. 

This  class  of  constraints  is  selected  because  it  permits  an  efficient  and  accurate 
solution  method,  while  still  being  general  enough  to  capture  real  world  constraints.  To 
demonstrate  the  generality  of  these  constraints,  consider  the  following  example  waveform, 
shown  in  Fig.  3.1,  that  shows  the  current  drawn  by  a  variable  speed  drive  (VSD). 


36 


This  waveform  clearly  allows  many  constraints  of  the  above  form  to  be  applied. 
For  example,  the  large  regions  of  zeros  allow  constraints  of  the  form  i[j]  =  0,  and  the 
symmetry  of  the  non-zero  regions  allow  symmetry  constraints.  Moreover,  it  is  known  [6] 
that,  for  this  waveform,  even  harmonics  and  the  so-called  “triplen”  harmonics  (multiples 
of  3)  are  approximately  zero,  which  allows  constraints  of  the  form  Sk  =  0,  for  k  a  multiple 
of  2  or  3.  For  clarity,  this  particular  concrete  example  will  be  used  throughout  the  chapter 
to  illustrate  the  various  techniques  considered. 


3.3  A  First  Attempt  at  a  Solution 

The  goal  will  be  to  express  a  single  unknown  spectral  envelope  Sr  in  terms  of  a 
collection  of  known  spectral  envelopes  Sj,  for  j  G  J .  As  usual,  we  consider  samples  i[n]  of 
some  periodic  waveform  i{t).  Unlike  in  the  previous  chapter,  in  this  situation  we  do  not 
actually  take  as  input  i[n]  but  rather  just  the  Sj,  Vj  G  J.  We  also  assume  that  we  have 
sufficient  knowledge  of  i{t)  to  generate  constraints.  Rather  than  viewing  the  number  of 
samples  per  period,  N,  as  some  fixed  value  determined  by  sampling,  here  we  can  set  N 
based  on  the  number  of  constraints  we  desire.  The  idea  is  that  we  know  the  general  form 
of  i{t),  and  so  can  correctly  write  down  a  family  of  constraints  for  any  sampled  signal 
z[n]  consisting  of  N  samples,  for  any  N.  For  example,  if  faced  with  the  current  waveform 
i{t)  of  the  VSD,  Fig.  3.1,  we  can  immediately  determine  which  indices  j  should  have  the 
constraint  i[j]  =  0,  for  any  number  of  samples  N  by  simply  checking  if  the  jth  of  N 
samples  would  land  in  a  region  of  zeros.  The  key  point  is  that  we  can  set  N  arbitrarily. 

Every  constraint  of  the  form  expressed  in  (3.1)  sets  a  linear  combination  of  the 
Sfc’s  to  0,  where  each  coefficient  ^  ^[Civ]-  Thus,  we  can  form  a  matrix  equation  of  the 
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form 


0  =  AS,  (3.2) 

where  5  is  a  vector  of  the  s^’s,  and  A  is  a  matrix  with  entries  in  Z[CAf]  where  each  row 
represents  a  single  constraint.  We  can  order  the  entries  of  S  in  any  order;  place  the 
unknown  Sr  first,  and  the  known  Sj,  j  G  J  last,  with  all  other  spectral  envelopes  in 
arbitrary  order.  With  this  ordering  of  S,  the  first  column  of  A  corresponds  to  coefficients 
multiplying  Sr  and  the  last  |J|  columns  of  A  correspond  to  coefficient  multiplying  Sj, 
j  G  J.  We  wish  to  set  A  to  be  of  size  M  x  (M  +  |  J|),  for  some  M.  This  is  desirable 
because  if  we  place  A  in  reduced  row  echelon  form  (RREF),  the  first  row  of  the  resulting 
matrix  will  express  an  equation  of  the  form 

Sr  +  CjSj  =  0,  (3.3) 

jeJ 

which  gives  an  expression  for  the  unknown  Sr  in  terms  of  the  known  Sj,  j  G  J,  as  desired, 
where  Cj  G  Q[Civ]’ 

To  assure  that  we  can  form  the  M  x  (M  +  |  J|)  matrix  A,  we  will  further  make 
the  assumption  that  i{t)  is  bandlimited,  that  is  to  say,  i(t)  contains  no  spectral  content 
outside  of  some  finite  band.  This  means  that,  for  any  N,  there  exists  a  single  constant 
No  such  that  at  most  No  of  the  Sk  of  i[n]  will  be  non-zero.  We  will  also  make  the 
assumption  that  i{t)  has  a  region  of  zeros,  some  sort  of  symmetry,  or  any  other  structure 
that  allows  a  number  of  constraints  of  the  above  form,  that  increases  with  N,  to  be 
written.  The  number  of  such  valid  constraints  grows  with  N  because,  if  for  example,  i{t) 
has  a  region  of  zeros,  then  as  N  increases,  more  and  more  sample  points  will  fall  in  that 
region.  Consequently,  as  N  increases,  the  number  of  constraints  on  i[n\  increases  but  the 
number  of  non-zero  s;  does  not  increase  past  some  finite  limit. 
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To  be  precise,  let  No  denote  the  finite  limit  on  the  number  of  non-zero  spectral 
envelopes  implied  by  the  fact  that  i{t)  is  bandlimited.  Then  consider  increasing  N, 
starting  at  A^o-  As  N  increases,  increasingly  many  Sk  become  defined,  but  all  the  “new” 
Sfc  =  0.  At  the  same  time,  we  have  increasingly  many  constraints  on  i[n].  This  means 
that  the  total  number  of  constraints  on  the  Sk  increases  with  N  faster  than  the  number 
of  defined  Sk  (every  new  Sk  introduced  comes  with  the  constraint  Sk  =  0,  but  we  also  add 
other  new  constraints,  such  as  zero  constraints  on  i[n],  or  symmetry  constraints).  Thus, 
at  some  point,  we  have  N  —  \  J\  constraints  on  the  N  spectral  envelopes.  We  can  then 
write  the  matrix  equation 

0  =  AS 


where  A  is  a  M  x  (M  -h  |  J|)  matrix,  with  M  =  N  —  \  J\. 

Many  of  the  constraints  are  simply  of  the  form  Sk  =  0,  and  so  we  can  delete 
each  column  corresponding  to  such  an  Sk,  remove  the  entry  from  S  that  corresponds  to 
Sk  and  delete  the  row  of  A  that  corresponds  to  the  constraint.  This  will  improve  the 
speed  of  subsequent  computations  on  the  matrix  by  decreasing  its  size,  without  hurting 
the  resulting  accuracy.  It  should  be  noted  that  this  idea  does  not  only  work  on  a  single 
N,  but  rather  all  sufficiently  large  N  because  as  N  increases  further,  we  only  get  more 
constraints  relative  to  the  number  of  spectral  envelopes.  Whenever  we  have  “extra” 
constraints  (that  is  to  say,  more  constraints  than  would  fit  in  a  M  x  (M  -t-  |  J|)  matrix), 
we  can  simply  not  use  the  extra  constraints.  In  particular,  we  can  choose  to  ignore 
constraints  of  the  form  Sk  =  0  when  we  have  extra  constraints.  This  is  desirable  as  the 
constraints  Sk  =  0  are  sometimes  only  approximate  because  the  waveform  i{t)  is  only 
approximately  bandlimited.  One  could  expect  accuracy  of  the  resulting  estimation  to 
increase  with  N  because  as  N  gets  larger  and  larger,  we  are  able  to  use  (relatively)  fewer 
constraints  of  the  form  sa,  —  0 
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%  error  vs  N 


N 

Figure  3.2:  Percent  error  using  S5  and  Sy  to  estimate  Si. 

This  idea  was  applied  to  samples  i[n]  of  the  current  waveform  i{t)  depicted  in 
Fig.  3.1  to  estimate  Si  from  knowledge  of  S5  and  Sy.  In  Fig.  3.2,  the  error  of  the  resulting 
estimation  is  plotted  as  a  function  of  N.  While  this  does  obtain  somewhat  low  error, 
and  the  error  decreases  with  N  initially,  as  expected,  the  error  actually  increases  without 
bound  for  very  large  N.  This  is  due  to  a  numerical  problem.  The  computation  to 
put  A  in  RREF,  by  performing  Gaussian  elimination  involves  floating  point  division  by 
numbers  that  get  smaller  with  increasing  N.  This  substantially  limits  the  effectiveness 
of  this  technique.  To  avoid  this  problem,  the  following  section  will  consider  a  different 
solution  method  in  which  all  computation  is  done,  effectively,  over  the  integers.  This  will 
completely  eliminate  numerical  problems. 
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3.4  A  Refined  Solution  Using  Cyclotomic  Fields 


As  noted  in  the  previous  section,  transforming  the  constraint  matrix  into  RREF 
by  computing  with  floating  point  arithmetic  is  numerically  unstable.  Fortunately,  this 
problem  can  be  completely  avoided  by  recognizing  that  the  elements  in  the  matrix,  ini¬ 
tially  as  well  as  at  every  step  of  the  RREF  computation,  have  a  certain  special  property: 
they  are  elements  of  a  cyclotomic  field.  A  cyclotomic  field  is  simply  an  algebraic  number 
field  generated  over  Q  (the  rationals)  by  a  primitive  root  of  nnity.  Algebraic  number 
fields  (called  number  fields  by  some  authors)  are  finite  (and  therefore  algebraic)  exten¬ 
sions  of  Q.  Elementary  properties  of  the  cyclotomic  fields  can  be  found,  for  example, 
in  [3].  The  A^*^  cyclotomic  field  is  simply  Q[Civ]  (this  denotes  adjoining  Cjv  to  Q).  Any 
element  y  G  Q[CAr]  can  be  expressed  in  the  form 

N-l 

y  —  (3-4) 

3=0 

with  Cj  G  Q.  Clearly,  every  element  of  the  matrix  A  above  is  an  element  of  Q[CAr]- 
Moreover,  since  the  cyclotomic  field  is  a  field  (with  the  usual  arithmetic  operations) 
it  is  closed  nnder  addition,  subtraction,  multiplication  and  division.  As  the  process  of 
performing  Gaussian  elimination  only  involves  these  arithmetic  operations,  we  see  that 
the  matrix,  at  any  point  during  the  computation,  will  only  consist  of  elements  from  a 
cyclotomic  field.  Clearly,  elements  of  the  cyclotomic  field  can  be  represented  exactly  (by, 
for  example,  storing  the  rational  coefficients  cj  that  define  each  element  y),  and  so  it  is 
possible  to  perform  the  computation  exactly. 

A  straightforward  way  to  perform  computations  over  the  cyclotomic  field  would 
be  to  store  the  collection  of  rational  coefficients  Cj  that  represent  a  given  element  in 
each  entry  of  the  matrix.  Then,  perform  Gaussian  elimination  as  usual,  substituting  the 
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cyclotomic  field  operations  in  place  of  arithmetic  on  scalar  quantities.  A  slightly  different 
approach  will  be  taken  here  for  computational  efficiency  reasons. 

There  is  a  natural  isomorphism  between  the  cyclotomic  field  and  Q[A’]//iv(A’), 
where  Q[A]  denotes  the  ring  of  all  polynomials  in  one  variable  with  rational  coefficients, 
and  /ivlA”)  denotes  the  cyclotomic  polynomial  [3].  The  A**'  cyclotomic  polynomial 
is  defined  by 

/kw= 

where  0  consists  of  all  primitive  roots  of  unity  in  C.  Thus,  we  can  view  each  element 
of  A  as  an  equivalence  class  of  polynomials  over  a  single  formal  parameter.  That  is  to 
say,  each  particular  element  of  A  can  be  viewed  as  the  set  of  all  polynomials  with  rational 
coefficients  that  are  equivalent,  modulo  /iv(A),  to  a  single  specific  polynomial  (this  spe¬ 
cific  polynomial  is  different  for  different  entries  of  the  matrix).  To  better  understand  this 
isomorphism,  notice  that  any  y  E  Q[C7v]  is  given  by  (3.4)  as  a  linear  combination  of  pow¬ 
ers  of  Cv-  In  some  sense,  we  can  view  y  as  being  the  value  of  a  polynomial  g{X)  G  Q[A] 
evaluated  at  Cn,  where 

JV-l 

3(x) = y]  c,xK 

j=0 

The  coefficients  of  the  polynomial  are  the  same  as  the  coefficients  used  to  define  y. 
Moreover,  we  could  view  y  as  being  the  value  of  any  polynomial  h{X)  G  Q[A]  at  Cjv 
where  h{X)  =  g{X)  -h  fjq{X)k{X),  with  k{X)  G  Q[A]  being  arbitrary  because  fniX), 
the  A*^  cyclotomic  polynomial,  has  Cjv  as  a  root.  The  family  of  h{X)  is  simply  the 
equivalence  class  of  polynomials  (in  Q[A])  that  are  congruent  to  g{X)  modulo  /jv(A). 
While  this  helps  make  clear  the  structure  of  the  isomorphism,  it  is  important  to  remember 
that  the  X  in  each  polynomial  is  only  a  formal  parameter;  it  will  not  take  any  values. 

Using  this  idea,  we  can  store  in  each  entry  of  A  an  arbitrary  lift  of  the  equivalence 
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class  of  polynomials  represented  by  that  entry  (that  is  to  say,  store  any  single  polynomial 
in  the  equivalence  class).  We  can  store  a  polynomial  by  storing  its  coefficients.  Notice 
that  /v(^)  is  of  degree  [3],  where  0(N’)  is  Euler’s  totient  function  and  is  defined  to 
be  the  the  number  of  integers  j  <  N  where  j  is  relatively  prime  to  N.  Thus,  we  have  a 
slight  reduction  in  storage  over  the  initial  scheme  of  storing  the  coefficients  expressed  in 
(3.4).  However,  since  (j>{N)  =  Q{N),  this  is  actually  not  an  asymptotic  improvement  in 
storage,  but  still  might  be  useful  in  practice.  To  perform  Gaussian  elimination,  we  simply 
replace  the  ordinary  arithmetic  operations  that  Gaussian  elimination  would  perform  on 
scalars  with  the  corresponding  operations  on  polynomials  (addition  becomes  addition  of 
coefficients,  multiplication  becomes  convolution  of  coefficients,  and  so  on)  with  the  added 
fact  that  we  perform  operations  modulo  Again,  we  see  a  slight  computational 

improvement  by  using  the  polynomial  representation  rather  than  the  initial  representa¬ 
tion  because  we  are  only  operating  on  coefficients  rather  than  N  coefficients.  This 
is  again  not  an  asymptotic  improvement,  but  might  still  be  of  value  in  practice.  In  any 
case,  addition  and  subtraction  are  0{N),  and  multiplication  and  division  are  0{N^)  by 
the  naive  algorithms.  Since  Gaussian  elimination  involves  0{N^)  arithmetic  operations, 
we  have  a  runtime  bound  of  0{N^),  which  is  still  reasonable  due  to  the  relatively  small  N 
involved.  Multiplication  and  division  will  be  improved  to  0{N  log  N)  in  the  next  section 
by  using  the  Number  Theoretic  Transform  and  properties  of  multiplying  and  dividing 
polynomials.  This  will  improve  the  runtime  bound  to  0{N'^  log  N). 

The  above  algorithm  performs  all  computations  exactly  to  produce  a  relation  of 
the  form  Sr  =  '^j^jbjSj,  where  bj  G  QCiV,  which  expresses  unknown  Sr  in  terms  of 
known  Sj.  Of  course,  actually  evaluating  Sr  from  the  Sj  will  involve  computing  this  sum 
with  floating  point  arithmetic  (because  Sj  will  likely  not  be  elements  of  but  rather 
arbitrary  values  in  C).  Fortunately,  this  only  involves  a  small  number  of  floating  point 
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%  error  vs  N 


Figure  3.3:  Percent  error  using  S5  and  Sj  to  estimate  Si  with  the  refined  method. 

additions  and  multiplications  and  so  does  not  exhibit  the  numerical  instability  of  the 
initial  solution  technique. 

This  procedure  was  applied  to  samples  z[n]  of  the  current  waveform  i{t)  depicted 
in  Fig.  3.1  to  estimate  Si  given  S5  and  Sj,  as  in  the  previous  section.  The  results  are  shown 
in  Fig.  3.3.  As  can  be  seen,  there  is  no  longer  a  numerical  instability.  The  implementation 
was  done  in  GP/PARI;  the  code  is  included  in  Appendix  B. 

3.5  Speed  Improvement  Using  the  Number  Theo¬ 
retic  Transform 

While  the  algorithm  presented  in  the  previous  section  is  sufficiently  fast  to  be 
practical,  there  is  still  room  for  improvement.  This  section  will  consider  a  method  to 
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improve  the  speed  of  the  basic  arithmetic  operations  of  multiplication  and  division.  Mul¬ 
tiplication  and  division  of  polynomials  involves  convolution  and  deconvolution,  respec¬ 
tively,  of  coefficients.  This  immediately  suggests  using  a  procedure  like  the  Fast-Fourier 
Transform  (FFT).  The  convolution  theorem  [2],  states  that,  for  x  =  xi,X2,  ■  ■  ■  ,xm  and 
y  =  yi,y2,...,yN,  we  have 


FFT{x  *y)  =  FFT{x)FFT{y). 

As  the  FFT  can  be  computed  in  0{N  log  N)  time,  this  immediately  yields  a  0{N  log  N) 
algorithm  for  multiplication  and  division  of  polynomials.  Specifically,  given  polynomials 
/(^)  =  and  g{X)  =  we  have  f{X)g{X)  =  h{X)  =  YjjhjX\  where 

fk  *  9j-k,  which  is  a  convolution.  Thus,  we  can  multiply  f{X)  and  giX)  by 
taking  the  FFT  of  the  coefficients  of  /(X)  and  g{X)  separately,  multiplying  the  FFTs 
elementwise,  and  computing  the  inverse  transform  (again,  using  an  FFT).  The  result  will 
be  the  coefficients  of  h{X).  Division  functions  in  an  analogous  fashion,  with  the  only 
modification  being  that  we  divide  FFTs  elementwise. 

An  immediate  problem  with  the  above  scheme  is  the  fact  that  it  involves  com¬ 
puting  FFTs  with  floating  point  operations.  Since  our  goal  is  perform  all  computations 
exactly,  we  would  have  to  determine  the  proper  exact  representation  of  coefficients  re¬ 
turned  by  the  above  procedure.  While  this  is,  in  principle,  possible,  it  adds  unnecessary 
extra  work.  As  an  alternative,  consider  the  Number  Theoretic  Transform  (NTT). 

Given  a  sequence  x  =  xi, ...  ,xis[  where  Xj  G  Z,  the  NTT  of  x  is  a  sequence 
X  =  Xi, . . . ,  Xn  where 

Xk  =  mod  p,  (3.5) 
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and 


k 

where  s  G  N  is  a  free  parameter  that  will  be  set  later,  p  =  sN  +  1  is  a  prime,  and  u  =  ry®, 
where  77  is  a  primitive  root  of  unity,  modulo  p.  The  NTT  also  obeys  the  convolution 
theorem,  but  has  the  advantage  that  all  computation  is  done  over  the  integers.  One  can 
also  immediately  define  a  “Fast”  NTT,  analogous  to  the  FFT,  which  uses  an  identical 
divide  and  conquer  approach  to  compute  an  NTT  in  0(iV  log  N”).  Thus,  we  can  use 
an  NTT  in  place  of  the  FFT  in  the  above  procedure  to  enable  the  fast  computation  of 
multiplication  and  division. 

To  do  this  correctly,  we  must  set  the  prime  p  to  be  larger  (by  a  factor  of  2) 
than  any  coefficient  in  the  polynomials  input  to  multiplication  and  division,  as  well  as 
any  coefficient  in  the  output  polynomial  (the  value  of  each  coefficient  in  the  output 
polynomial  can  trivially  be  bounded  in  terms  of  the  coefficients  of  the  input  polynomial) . 
This  is  done  by  setting  s  appropriately.  This  is  necessary  to  assure  that  if  we  take 
the  representative  in  of  each  congruence  class,  we  will  obtain  the  correct 

coefficient  (this  is  just  saying  there  there  is  no  ambiguity  introduced  by  working  modulo 
p]  that  is  to  say,  p  is  large  enough  so  that  knowing  the  value  of  the  coefficient  modulo 
p  immediately  yields  the  value  of  the  coefficient).  Code  to  perform  these  calculations  is 
included  in  Appendix  B. 


3.6  Ring  of  Integers  of  a  Cyclotomic  Field 

This  section  will  consider  an  alternate  scheme  for  estimating  an  unknown  Sr  in 
terms  of  known  Sj,  j  E  J.  We  begin  with  some  terminology  from  algebraic  number 
theory;  see,  for  example  [4].  Let  K  denote  a  number  field  (in  our  application,  K  will  be 
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a  cyclotomic  field,  but  the  following  definitions  apply  to  all  number  fields).  We  say  an 
element  x  £  K  integral  over  a  ring  B  if  we  have  an  equation  of  integral  dependence: 


+  . . .  +  +  6o  =  0,  (3.7) 

where  bi  G  B,  Vi.  This  is  simply  the  statement  that  x  is  a  root  of  a  monic  polynomial 
with  coefficients  in  B.  We  call  the  collection  of  elements  in  K  that  are  integral  over 
the  ring  Z  the  integers  of  K.  These  elements  form  a  ring  (with  the  usual  addition  and 
multiplication  operations),  but  not  a  field  (in  general,  we  cannot  divide  elements).  This 
ring  is  called  the  ring  of  intgers  of  K. 

For  a  cyclotomic  field,  the  ring  of  integers  is  simply  Z[('jv]  [3],  and  so  every  element 
of  the  constraint  matrix  A  (see  (3.2))  is  an  element  of  the  ring  of  integers  of  a  cyclotomic 
field.  The  previous  algorithm  uses  Gaussian  elimination  to  transform  A  into  RREF,  at 
which  point  the  first  row  of  the  matrix  corresponds  to  the  equation  Sr  +  XljG j  —  0) 
where  Cj  G  Q[Civ]-  Here,  the  idea  will  be  to  work  in  the  ring  Z[Civ]  and  ultimately  produce 
an  equation  d'sr  +  djSj  =  0,  where  dj  G  Z[^jv],  \/j  G  J  and  d'  G  ZlCv]- 

Of  course,  we  cannot  simply  use  Gaussian  elimination  because  that  requires  di¬ 
vision,  which  cannot  (in  general)  be  done  in  a  ring.  The  idea  will  be  to  use  a  similar 
process  where  we  skip  the  step  of  dividing  a  row  by  its  leading  element  (this  is  the  only 
step  of  Gaussian  elimination  that  involves  division).  To  be  precise,  in  ordinary  Gaussian 
elimination,  we  operate  column  by  column,  transforming  the  matrix  so  that  each  column 
has  only  a  single  1  and  all  other  entries  0.  For  each  column,  we  select  a  row  with  a 
non-zero  element  in  that  column  to  operate  on,  call  this  row  m.  The  sequence  of  steps 
performed  by  standard  Gaussian  elimination,  for  row  m,  is  shown  in  pseudocode  below. 
In  the  following,  let  leading{m)  return  the  index  of  the  first  non-zero  entry  in  row  m,  let 
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R  and  C  denote  the  number  of  rows  and  columns,  respectively,  of  the  matrix,  and  let  Uij 
denote  the  entry  in  row  i  column  j. 


c  leading{m) 

P  ^ 

for  j  =  1  to  C  do 

j p 

end  for 

for  all  i  E  [1,  i?]  \  {m}  do 

Q  ^  Ujc 

for  J  =  1  to  C  do 

dij  (Xij  ^mjQ 

end  for 
end  for 

This  will  be  modihed  to  the  following: 


c  •«—  leading{m) 

P  ^Tnc 

for  all  i  G  [1,  i?]  \  {m  }  do 

q  ^  aic 

for  j  =  1  to  (7  do 

^ij  ^ijP  (^mjq 

end  for 
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end  for 


In  the  original  Gaussian  elimination  algorithm,  we  operate  on  row  m  by  first 
dividing  each  entry  of  row  m  by  the  leading  element,  then,  for  every  row  i  ^  m,  we 
subtract  a  multiple  q  of  row  m  from  row  i,  where  q  is  simply  aic,  the  element  in  row  i  in 
the  same  column  as  the  leading  element  of  row  m.  This  has  the  effect  of  clearing  column 
c,  except  for  amc,  which  is  set  to  1.  Every  step  of  this  process  preserves  the  validity 
of  the  system  of  equations  because  we  are  only  either  dividing  a  row  by  a  constant  or 
subtracting  a  multiple  of  one  row  from  another  row.  This  indeed  preserves  the  validity 
of  the  system  of  equations  represented  by  the  matrix  because  each  row  i  of  the  matrix 
corresponds  to  an  equation  UijSj  =  0,  and  thus  dividing  by  a  constant  only  divides  all 
coefficients  by  the  same  constant;  similarly,  subtracting  a  multiple  of  one  row  to  another 
corresponds  to  subtracting  a  multiple  of  one  equation  from  another.  The  only  changes 
are  that  we  now  do  not  divide  row  m  by  its  leading  element  but  instead  multiply  each 
row  i  by  the  leading  element  of  row  m  and  then  subtract  the  same  multiple  q  of  row  m 
from  row  i.  We  are  allowed  to  multiply  row  i  by  the  leading  element  of  row  m  (or,  in  fact, 
by  any  constant)  because  again  row  i  represents  an  equation  of  the  form  aijSj  —  0, 
which  is  unaffected  by  multiplying  the  coefficients  by  any  non-zero  constant. 

All  operations  in  this  new  scheme  can  be  done  over  a  ring,  and  so  the  above  algo¬ 
rithm  could  indeed  be  used  to  produce  a  relation  d' Sr  +  Y2j^jdjSj  =  0,  as  desired.  One 
significant  problem,  however,  is  that  if  the  above  algorithm  is  used  as  described,  the  coef¬ 
ficients  of  the  polynomials  stored  in  each  entry  of  the  matrix  will  become  extremely  large. 
In  such  a  situation,  it  will  no  longer  be  appropriate  to  treat  the  individual  arithmetic  op¬ 
erations  on  coefficients  as  0(1)  (these  are  the  operations  discussed  above  to  compute  the 
coefficients  of  a  polynomial  that  results  from  the  addition  or  multiplication  of  two  other 
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polynomials).  Essentially,  this  problem  occurs  because,  as  noted  above,  multiplying  each 
row  by  any  non-zero  value  does  not  change  the  equation  the  row  defines.  In  the  above 
procedure,  each  row,  in  some  sense,  accumulates  extraneous  multiplying  factors.  It  will 
be  desirable  to  remove  these  factors  during  the  computation  and  thereby  “simplify”  each 
row. 

One  obvious  idea  would  be  to  divide  each  row  by  the  greatest  common  divisor 
(GCD)  of  all  the  elements  (where  here  the  elements  are  equivalence  classes  of  polyno¬ 
mials,  or  equivalently  elements  of  Z[Civ])-  Despite  the  fact  that  we  cannot,  in  general, 
divide  elements  in  a  ring,  we  can  still  certainly  divide  any  element  by  one  of  its  divisors. 
However,  we  still  encounter  difficultly  in  computing  the  GCD. 

Over  the  integers,  one  can  compute  the  GCD  using  Euclid’s  algorithm.  The 
integers  are  a  ring.  Euclid’s  algorithm  can  be  generalized  to  many  other  rings,  which  all 
called  Euclidean  domains.  This  includes  the  rings  of  integers  of  many  number  fields.  To 
determine  if  Euclid’s  algorithm  can  be  extended  to  the  ring  of  integers  of  a  particular 
cyclotomic  field,  we  must  first  introduce  a  bit  more  terminology,  see  [4],  An  integral 
domain  is  a  ring  with  more  than  one  element  that  has  no  zero  divisors.  An  ideal  A  of 
a  ring  i?  is  a  subset  of  R  such  that  if  ai ,  02  €  A  and  r  E  R,  v/e  have  ai  +  02  E  A  and 
rai  E  A  (that  is  to  say,  it  is  closed  under  addition,  and  also  under  multiplication  by  any 
element  in  R).  An  ideal  is  thus  clearly  also  a  ring.  We  say  an  ideal  A  is  generated  by 
elements  gi, gk  E  R  if  A  is  the  intersection  of  all  ideals  in  R  that  contain  gi, ...  ,gk. 
A  principal  ideal  is  an  ideal  generated  by  a  single  element.  A  principal  ideal  domain  is 
an  integral  domain  that  only  has  principal  ideals. 

We  can  use  the  fact  that  a  ring  of  integers  of  a  number  field  is  a  Euclidean  domain 
if  and  only  if  it  is  a  principal  ideal  domain  (PID)  [4].  Unfortunately,  as  shown  in  [5], 
the  set  of  N  such  that  the  ring  of  integers  of  the  A^th  cyclotomic  field  is  a  PID  is  finite. 
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In  fact,  for  N->  90,  the  ring  of  integers  of  a  cyclotomic  field  is  never  a  PID  [5].  This 
eliminates  the  possibility  of  using  Euclid’s  algorithm. 

In  some  sense,  the  failure  to  be  a  PID  can  be  viewed  as  the  failure  for  unique 
factorization  to  hold.  For  the  integers,  and  more  generally  for  any  PID,  we  have  the 
Fundamental  Theorem  of  Arithmetic  which  states  that  every  element  can  be  uniquely 
factored  into  a  product  of  powers  of  primes  and  a  unit,  where  a  unit  is  simply  an  invertible 
element  (±1  in  Z)  and  a  prime  is  an  indecomposable  element  (the  usual  primes  in  Z). 
While  we  cannot  uniquely  factor  elements  of  a  non-PID  ring  of  integers  of  a  cyclotomic 
field,  we  can  accomplish  the  same  goal  by  factoring  ideals. 

We  make  use  of  Dedekind’s  Theorem  [4]  which  states  that  in  the  ring  of  integers 
of  any  number  field,  we  can  uniquely  factor  any  ideal  into  the  product  of  powers  of  prime 
ideals.  In  fact,  this  holds  for  a  more  general  class  of  rings  called  Dedekind  rings,  but  this 
is  not  needed  here.  The  idea  will  then  be  to  factor  out  common  terms  from  a  row  of  the 
matrix  by,  for  each  element  in  the  row,  computing  the  principal  ideal  generated  by  that 
element,  adding  all  the  principal  ideals  together,  and,  if  the  result  is  a  principal  ideal, 
taking  its  generator.  Every  element  of  the  row  will  be  divisible  by  this  generator,  and  so 
we  can  divide  out  that  common  factor.  This  algorithm  is  discussed  in  detail  in  [7]. 

This  simplification  procedure  allows  the  above  algorithm  to  be  used  as  an  alterna¬ 
tive  way  to  compute  a  relation  between  an  unknown  Sr  and  known  Sj,  j  E.  J.  Both  this 
algorithm  and  the  original  algorithm  compute  this  relation  exactly,  and  so  the  accuracy 
of  both  algorithms  is  identical. 
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Chapter  4 


Classification 

4.1  Fundamental  Problem 

The  goal  of  a  classification  algorithm  is  to  determine  when  each  load  in  a  collection 
of  electrical  loads  turns  on  and  off.  The  data  used  to  make  this  determination  is  the 
aggregate  current  drawn  by  the  collection  of  loads  and  the  line  voltage  supplied  to  the 
loads.  To  begin  to  develop  such  an  algorithm,  a  simpler  fundamental  problem  will  be 
examined  first.  Consider  a  black-box  that  contains  a  single,  unknown  electrical  load 
drawn  from  a  collection  of  electrical  loads.  The  goal  is  to  determine  which  load  is  in  the 
black-box  by  examining  the  current  drawn  by  that  load  when  the  unknown  device  is  first 
turned  on. 

To  be  precise,  let  L  =  {Zi, . . . ,  Im}  denote  a  set  of  M  electrical  loads.  A  single 
load,  I  is  selected  from  L  according  to  some  probability  mass  function  pi{lj)  =  Pr[l  =  Ij]] 
that  is  to  say,  pi{lj)  denotes  the  probability  that  load  Ij  is  selected.  At  this  stage,  pi{lj) 
will  be  assumed  to  be  known.  Let  I  —  (Zq,  . . . ,  iv-i)  denote  the  ordered  A-tuple  of 
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current  samples  drawn  by  the  load  I  when  it  is  turned  on.  These  samples  are  collected 
uniformly  in  time,  with  n  samples  per  period  of  the  line  voltage.  It  should  be  noted  that 
/  is  a  truncated  version  of  the  infinitely  long  vector  of  current  samples  that  would  be 
obtained  by  sampling  the  current  waveform  for  all  time.  It  will  be  assumed  that  that  is 
some  sufficiently  long  period  of  time  such  that  ceasing  sampling  after  this  period  of  time 
will  not  cause  the  loss  of  any  identifying  information;  that  is  to  say,  all  relevant  features 
of  the  current  samples  are  contained  in  finitely  many  of  the  infinite  collection  of  current 
samples. 

The  classification  algorithm,  A,  takes  input  L  and  /  and  produces  a  prediction  1 
of  the  identity  of  the  load  1.  The  goal  is  to  maximize  the  probability  that  1  =  1,  which 
is,  by  definition,  accomplished  by  setting 

I  =  argmaxi.^LPr[l  = 

Using  elementary  probability,  this  can  be  rearranged  as 

Pr\l  =  l\ 

I  =  argmaxi.^LPr[I\l  =  Ij]  •  (4.1) 

In  the  above  equation,  Pr[l  =  Ij]  is  simply  the  apriori  probability  that  load  Ij  is  in 
the  box,  which  is  given  by  pi{lj).  Pr[I\l  =  Ij]  is  simply  the  probability  of  generating  the 
current  iV-tuple  I  given  the  device  Ij.  It  should  be  noted  that,  even  for  a  single  fixed  load 
Ij,  many  diff'erent  current  A^-tuples  I  could  be  measured  due  to  noise  and  other  factors 
(for  example,  a  real  electrical  load  might  draw  different  current  waveforms  in  warm  and 
cold  environments).  Pr[I]  is  the  apriori  probability  that  I  will  be  the  observed  data. 
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which  is  given  by 


Pr[/l  =  X^Pr[/|i  =  yPr[(  =  y. 

3 

Thus,  in  principle,  the  classification  algorithm  is  quite  trivial.  Given  a  sufficiently 
detailed  model  of  the  electrical  characteristics  of  every  load,  one  could  calculate  Pr[I\l  = 
Ij]  for  any  load  Ij  E  L  and,  using  (4.1),  make  the  best  possible  prediction  I  of  1.  In 
practice,  however,  such  a  detailed  model  of  the  underlying  physical  properties  of  the 
loads  is  often  unavailable.  That  is  to  say,  while  determining  the  physical  properties  of 
the  devices  may  be  possible,  it  is  likely  undesirable  to  perform  a  detailed  analysis  of  a 
load  before  the  algorithm  could  be  used  to  identify  that  load.  An  alternative  to  having 
these  detailed  models,  considered  in  the  next  section,  is  to  construct  a  simplified  model 
using  observed  data. 


4.2  Device  Modeling 

As  shown  in  the  previous  section,  the  development  of  a  classification  algorithm 
requires  certain  pieces  of  information  about  the  collection  of  electrical  loads.  In  particular, 
it  is  necessary  to  know  the  conditional  probability  distribution  of  current,  Pr[I\l  =  Ij], 
for  each  Ij  €  L  and  every  possible  current  vector  I,  as  well  as  the  probabilities  Pi{lj) 
that  each  load  Ij  E  L  will  be  the  load  in  the  black-box  (which,  up  to  this  point,  has 
been  assumed  to  be  known).  While  it  is  true  that  Pr[I\l  =  Ij]  could  be  calculated  from 
sufficiently  detailed  apriori  knowledge  of  the  electrical  characteristics  of  load  Ij,  it  is, 
as  noted  in  the  previous  section,  often  impractical  to  do  so.  This  section  will  consider 
methods  to  estimate  these  unknown  probabilities  experimentally. 

Consider  a  modified  version  of  the  black-box  load  problem  stated  above.  In  this 
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modified  version  of  the  problem,  there  are  two  phases:  a  training  phase  and  a  classification 
phase.  In  the  training  phase,  data  is  gathered  on  the  loads  that  will  be  used  to  estimate 
the  unknown  probabilities.  In  the  classification  phase,  these  estimated  probabilities  will 
be  used  in  classification  algorithm  of  the  previous  section  to  classify  an  unknown  load  in 
the  black-box.  To  be  precise,  the  training  phase  will  consist  of  some  large  number  K  of 
trials.  In  each  trial,  a  load  I  e  L  is  selected  according  the  probability  distribution 
the  load  is  then  turned  on,  and  data  is  collected.  Throughout  the  training  phase,  the 
loads  are  not  operating  in  a  black-box;  that  is  to  say,  the  identity  of  each  load  selected 
is  known. 

The  data  from  the  training  phase  can  then  be  used  to  estimate  the  probabilities 
Pr[I\l  =  Ij]  for  each  load  Ij  €  T.  A  straightforward,  but  entirely  impractical,  way  to 
do  this  would  be  simply  count  the  number  of  times  that  any  particular  load  I  produced 
the  current  vector  /,  which  will  be  denoted  Kij,  and  the  number  of  times  that  load  I 
was  selected,  which  will  be  denoted  Ki,  and  then  estimate  Pr[I\l  —  Ij]  by  the  quantity 
This  is  completely  impractical  because,  even  though  the  number  of  possible  current 
vectors  is  finite,  it  is  extremely  large.  If  each  of  the  N  samples  of  current  is  taken  to  b 
bits  of  precision,  then  there  are  2^^  possible  values  of  I.  It  would  not  be  reasonable  to 
even  store  2^^  estimates  for  each  of  the  M  loads,  much  less  actually  produce  all  of  the 
estimates. 

A  more  practical  approach  is  to  assume  that  the  distributions  Pr[I\l  =  Ij]  have 
some  simple  functional  form  with  only  a  few  unknown  parameters.  A  particularly  useful 
form  arises  from  assuming  that  each  load  Ij  has  a  corresponding  characteristic  current 
vector  Ij  that  it  would  draw  under  ideal  circumstances.  That  is  to  say,  in  the  complete 
absence  of  noise  and  measurement  inaccuracy,  if  device  Ij  were  turned  on,  the  measured 
current  would  be  Ij.  Noise  will  be  modeled  as  additive  white  Gaussian  noise.  Without 
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loss  of  generality,  this  noise  can  be  assumed  to  be  zero-mean,  because  if  the  noise  had  a 
non-zero  mean,  the  mean  could  be  estimated  accurately  over  some  long  period  of  time 
and  subtracted  out.  In  this  setting,  only  the  characteristic  current  vectors  {Ij}  for  each 
of  the  loads  Ij  G  L,  as  well  as  the  variance  of  the  noise  (a  Gaussian  is  completely 
determined  by  its  mean  and  variance)  needs  to  be  estimated.  This  reduces  the  task  of 
estimating  parameters  to  estimating  only  MN  +  1  parameters. 

Additionally,  it  should  be  noted  that,  in  this  setting,  it  is  no  longer  necessary  to 
assume  that  the  probability  mass  function  pi{lj)  is  known.  This  is  because  Pi{lj)  can 
be  approximated  by  the  proportion  of  trials  in  the  training  phase  in  which  load  Ij 
appears. 


4.3  Spectral  Envelopes 

Thus  far,  the  problem  of  classification  has  only  been  considered  using  raw  current 
as  the  source  of  data.  This  section  will  explore  classification  algorithms  that  instead 
make  use  of  the  Discrete  Fourier  Transform  (DFT)  of  the  raw  current  and  the  spectral 
envelope  representation.  For  notational  convenience,  this  chapter  will  use  a  slightly 
different  definition  of  spectral  envelopes  than  the  original  definition  given  in  Chapter  2. 
In  particular,  a  complex  form  of  spectral  envelopes  is  defined.  All  results  in  this  chapter 
would  apply  equally  well  to  spectral  envelopes  given  by  the  original  definition. 

The  DFT  is  defined  as  follows.  Given  a  sequence  of  n  complex  numbers,  Xq,  •  •  ■ ,  Xn-i , 
the  DFT  transforms  this  sequence  into  the  n  complex  numbers  Xq,  . . . ,  X„_i  where 

Xk  y^Xje~^^K 

j=o 
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Similarly,  the  inverse  Discrete  Fourier  Transform  (IDFT),  which  expresses  the  original 
sequence  in  terms  of  the  transformed  sequence,  is  given  by 


^  n—l 


k=0 


Using  this  definition  we  can  form  spectral  envelopes  So{t),. . . ,  Sn-i{t)  which  are 
defined  by 

(4.2) 

j=0 

Thus,  the  fcth  spectral  envelope  at  time  t  is  simply  the  fcth  term  in  the  DFT  of  the 
sequence  it,  ■■■ ,  it+n,  which  are  the  samples  of  current  over  one  line-cycle  worth  of  time 
(there  are  n  samples  of  current  taken  per  line  cycle  of  voltage),  rotated  by  an  appropriate 
factor.  It  should  be  noted  that,  because  t  can  take  any  (integral)  value,  the  sequence  of 
n  samples  used  to  produce  Sk{t)  at  any  particular  time  t  will  not  necessarily  start  or  end 
at  zero-crossings  of  the  line  voltage. 

The  spectral  envelopes  are  defined  in  this  way  so  that  they  will  correspond  to 
meaningful  physical  quantities.  For  example,  under  the  assumption  of  a  stiff,  harmon¬ 
ically  pure  line  voltage,  S{5'i(t)}  corresponds  to  real  power,  3?{S'i(t)}  corresponds  to 
reactive  power,  $J{52(t)}  corresponds  to  power  drawn  at  the  second  real  harmonic,  and 
so  on. 

In  considering  developing  a  classification  algorithm  that  uses  spectral  envelope 
data,  instead  of  raw  current  data,  an  immediate  question  is  the  accuracy  of  the  resulting 
classifier.  To  frame  this  question  properly,  consider  a  slightly  modified  version  of  the 
black-box  experiment  from  Section  1.  Again,  a  randomly  selected  load  Z  G  L  is  placed  in 
a  black-box  and  data  is  recorded  when  this  load  is  turned  on.  The  change  is  that  now. 
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instead  of  recording  I,  the  sequence  of  N  current  samples,  we  record  the  power  spectral 
envelopes. 

Consider  a  classification  algorithm  E  that  takes  as  input  L  and  So{t), ,  Sn-i{t) 
for  t  G  [0,  iV  —  n]  (the  input  consists  of  the  set  L  and  values  of  each  of  the  power  spectral 
envelopes  at  each  point  in  time)  and  outputs  a  prediction  I  of  the  identity  of  the  load  I 
with  the  goal  of  maximizing  the  probability  that  1  =  1.  Clearly,  the  best  prediction  that 
E  can  make  is 

I  =  argmaxi.^LPf'\l  =  G  [0,  n  —  1],  t  G  [0,  —  n]}]. 

The  natural  question  to  ask  is  which  of  the  two  predictions  (made  by  each  of  the 
two  algorithms  A  and  E)  has  the  higher  probability  of  being  correct.  The  answer  to  this 
question  is  that  the  predictions  made  by  A  and  the  predictions  made  by  E  have,  in  all 
cases,  the  same  probability  of  being  correct.  This  is  shown  by  the  following  lemma  that 
relies  on  the  fact  that  both  A  and  E  are,  by  definition,  optimal  algorithms.  That  is  to 
say,  they  produce  the  prediction  that  is  most  likely  given  their  input  (L  and  /  in  the  case 
of  A  and  L  and  {*S'j(t)|j  G  [0,  n  —  1],  t  G  [0,  Af  —  n]}  in  the  case  of  E). 

Lemma  2.  For  algorithms  A  and  E,  as  defined  above,  let  I  a  andls  denote  the  predictions 
made  by  algorithms  A  and  E,  respectively.  Then,  Pr'^A  =  1]  =  Pr\lE  =  1]  always. 

Proof.  Assume  that  there  is  some  case  for  which  the  prediction  of  E  is  better  (more  likely 
to  be  correct)  than  the  prediction  of  A.  It  is  immediately  clear  that  the  input  to  E  can 
be  calculated  from  the  input  to  A  (by  applying  equation  2  above).  Thus,  it  is  possible  to 
construct  a  third  algorithm  C  which  takes  input  L  and  /  (the  same  input  that  A  takes), 
produces  {-Sj(t)|j  G  [0, n  —  l],t  G  [0,N  —  n]},  runs  E,  and  outputs  the  prediction  made 
by  E.  By  the  assumption  that  E  will,  in  some  case,  return  a  better  prediction  than  A,  it 
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follows  that  C  would  return  a  better  prediction  than  A,  which  contradicts  the  optimality 
of  A.  This  contradiction  immediately  implies  that  there  is  no  case  in  which  algorithm  E 
could  produce  more  accurate  predictions  than  algorithm  A. 

Similarly,  assume  that  there  is  some  case  for  which  the  prediction  of  A  is  better 
than  the  prediction  of  E.  Again,  it  is  possible  to  produce  the  input  to  A  from  the 
input  to  E.  This  is  due  to  the  fact  that  the  DFT  is  invertible;  thus,  given  knowledge  of 
So{t), . . . ,  Sn-i{t)  for  all  time  the  vector  /  can  be  produced.  Therefore,  there  exists  an 
algorithm  D  which,  on  input  L  and  E  [0,  n  -  l],t  G  [0,  iV  -  n]}  could  produce 

a  better  prediction  than  E  by  using  A,  which  contradicts  the  optimality  of  algorithm 
E.  This  contradiction  implies  that  there  is  also  no  case  for  which  the  prediction  of  E  is 
better  than  the  prediction  of  A.  □ 

Given  that  algorithm  E  performs  exactly  as  well  as  algorithm  A,  the  natural  next 
question  to  ask  is  whether  there  is  any  motivation  in  recording  power  spectral  envelope 
data  instead  of  raw  current  data.  As  will  soon  be  demonstrated,  a  potential  motivation 
is  compression.  To  see  this,  consider  the  total  amount  of  storage  space  needed  for  the 
input  to  algorithm  A  and  the  input  to  algorithm  E.  In  both  cases,  the  input  includes  L, 
and  so  this  portion  of  the  storage  space  requirement  is  the  same  in  both  cases.  The  other 
part  of  the  input  for  algorithm  A  is  the  current  vector  /  which  consists  of  N  samples, 
each  of  which  are  taken  to  a  precision  of  b  bits,  and  so  storing  the  vector  I  requires  Nb 
bits.  Thus,  in  order  for  power  spectral  envelopes  to  be  useful  as  a  form  of  compression, 
they  must  require  fewer  than  Nb  bits  to  store. 

Unfortunately,  simply  storing  all  of  the  spectral  envelopes  used  as  input  to  E 
would  take  considerably  more  space  to  store.  Each  of  the  n  spectral  envelopes  must  be 
recorded  at  N  —  n  +  1  values  of  t]  if  2b  bits  are  used  to  record  each  of  these  spectral 
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envelope  values  (6  bi'ts  for  each  of  the  real  and  imaginary  parts  of  the  spectral  envelopes) 
then  a  total  of  2bn{N  —  n  4- 1)  would  be  required. 

However,  there  is  still  hope  due  to  the  tremendous  redundancy  in  the  spectral 
envelopes.  That  is  to  say,  despite  the  fact  that  so  many  bits  are  used  to  store  the 
spectral  envelopes,  their  actual  information  content  is  much  smaller.  To  see  this,  recall 
that  the  power  spectral  envelopes  can  be  determined  from  the  current  vector  I.  Thus,  it  is 
never  necessary  to  store  more  power  spectral  envelope  values  than  are  needed  to  uniquely 
determine  I.  This  allows  two  types  of  redundancies  to  immediately  be  exploited.  Firstly, 
since  I  is  real  (it  consists  of  measured  current  values,  and  has  no  imaginary  part),  it  must 
be  the  case  that  Xj  =  X^_j,  where  *  denotes  conjugation,  (this  is  simply  a  property  of 
the  DFT)  and  so  Sj{t)  =  Thus,  knowledge  of  {Sj(t)|j  G  [0,  |  —  l],t  € 

[0,  —  n]}  would  sufhce.  The  second  sort  of  redundancy  arises  from  the  fact  that  the 

DFT  is  invertible.  Thus,  given  knowledge  of  the  DFT  of  n  sequential  values  of  current 
(for  example,  from  time  ttot  +  n—1,  as  is  used  to  construct  {6'j(t)|j  G  [0,  n—  1]})  would 
suffice  to  reconstruct  those  n  values  of  current.  Thus,  it  is  only  necessary  to  record  the 
Sj{t)  at  a  spacing  of  n  in  time.  Combining  both  of  these  redundancies,  only  half  of  the 
n  spectral  envelopes  need  to  be  recorded,  and  only  at  ^  points  in  time.  If  each  of  the 
spectral  envelopes  are  recorded  to  2b  bits  of  precision  {b  bits  for  the  real  part  and  b  bits 
for  the  imaginary  part)  this  will  require  exactly  Nb  bits,  the  same  amount  of  storage 
space  necessary  to  store  the  raw  current  values. 

This  has  not  accomplished  any  sort  of  compression  because  the  same  amount  of 
storage  space  is  needed  to  store  spectral  envelope  values  as  was  needed  to  store  raw 
current  values.  In  order  to  achieve  actual  compression,  fewer  spectral  envelope  values 
must  be  stored.  It  can  be  hoped  that  there  is  some  additional  redundancy  in  the  data 
produced  by  real  electrical  loads  that  would  allow  fewer  spectral  envelope  values  to  be 
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stored  without  loss  of  information. 


As  observed  earlier,  the  power  spectral  envelopes  correspond  to  real  physical  quan¬ 
tities  (i.e.  corresponds  to  real  power).  Thus,  one  could  hope  that  it  is  the  case 

that  real  electrical  loads  only  draw  power  at  a  few  harmonics,  and  so  only  a  few  of  the 
spectral  envelopes  would  be  non-zero.  If  this  were  indeed  the  case,  then  a  large  amount 
of  storage  space  could  be  saved  by  not  storing  envelopes  with  a  constant  0  value.  Unfor¬ 
tunately,  as  will  soon  be  shown,  even  if  only  a  few  spectral  envelopes  are  non-zero  during 
steady-state  portions  of  a  waveform,  many  spectral  envelopes  will  be  non-zero  during 
transient  events. 

Consider  the  following  simple  current  vector  shown  below,  in  Fig.  4.1. 


Figure  4.1:  Current  Vector. 

This  current  vector  I  consists  of  samples  over  two  cycles  of  the  line  voltage.  It  is 
given  by  the  equation 

bm(^()  «€[0,1] 

-‘(.C  “■  \ 

|^2sin(^t)  te(l,2] 

For  clarity.  Figure  2,  below,  shows  views  of  this  current  waveform  over  three  windows. 
The  three  windows  are  each  of  length  one  cycle  {n  data  points)  and  start  at  an  offset  of 
0,|  and  1  cycles,  respectively. 
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Figure  4.2:  Current  Vector  split  into  windows. 
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In  this  example,  it  can  easily  be  shown  that  Si,Sn-i  and  {52^1^  G  [0,  |  —  1]} 
are  all  not  constantly  equal  to  0.  In  particular,  S'i(O)  ^  0,  S'„_i(0)  ^  0  and  S2k{%)  ^ 
0,  V/c  G  [0, 1  —  1].  Despite  the  fact  that  this  example  is  extremely  specific,  it  illustrates 
an  important  general  phenomenon.  The  current  vector  /  is  a  piecewise  function  where, 
over  each  cycle,  it  consists  of  a  single  pure  harmonic.  Therefore,  if  we  examine  the  power 
spectral  envelopes  at  times  given  by  the  start  of  cycles  of  the  line  voltages  (that  is  to 
say,  at  times  when  the  power  spectral  envelope  is  calculated  from  samples  all  within  the 
sample  line  cycle)  then  only  two  harmonics  will  be  non-zero  (only  one  of  which  needs  to 
be  recorded  as  each  of  these  harmonics  could  be  calculated  from  the  other).  However,  if 
we  examine  the  spectral  envelopes  at  times  given  by  the  middle  of  line  cycles  (when  half 
of  the  current  values  used  to  calculate  the  spectral  envelopes  come  from  two  adjacent 
cycles)  then  we  see  that  all  harmonics  whose  parity  differs  from  the  single  pure  harmonic 
are  non-zero. 

Despite  the  fact  that,  as  shown  above,  many  spectral  envelopes  may  not  be  iden¬ 
tically  equal  to  0,  this  does  not  preclude  the  possibility  that  the  vast  majority  of  spectral 
envelopes  are  0  at  all  points  of  interest.  Recall  that,  as  noted  above,  there  is  a  great 
deal  of  redundancy  in  the  spectral  envelopes.  In  particular,  it  is  only  necessary  to  store 
spectral  envelope  data  at  a  spacing  of  n  in  time.  Thus,  if  it  were  the  case  that  any  spec¬ 
tral  envelope  satisfied  Sj{kn)  =  0,Vk  E  [0,  ^  —  1],  then  storage  space  could  be  saved  by 
not  recording  spectral  envelope  Sj.  For  example,  in  the  example  above,  the  only  spectral 
envelope  that  isn’t  equal  to  0  at  points  in  time  that  are  multiples  of  n  is  ^i.  Thus,  this 
current  vector  could  be  stored  (with  no  loss  of  information)  by  only  storing  Si. 

More  generally,  consider  any  current  vector  I  that  consists  of  full-period  piecewise 
combinations  of  the  harmonics  ki,. . .  ,kp.  That  is  to  say,  over  each  cycle  of  the  line  voltage 
{n  points  of  data)  the  current  vector  can  be  written  as  a  linear  combination  of  the  complex 
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exponentials  Then,  clearly,  at  the  points  in  time  {kn\k  £  [0,  ^  —  1]}, 

the  only  non-zero  spectral  envelopes  will  be  Sk-^,  ■  ■  ■ ,  Skp-  Therefore,  in  such  a  case, 
storing  the  sufficient  set  of  spectral  envelopes  (the  set  necessary  to  compute  all  spectral 
envelope  values)  only  takes  ^Nb  bits  to  store,  which  could  be  much  less  than  that  Nb 
bits  needed  to  store  all  raw  current  values  if  p  is  much  smaller  than  n  (that  is  to  say,  if 
only  a  small  portion  of  the  possible  harmonics  are  actually  used) . 

This  same  notion  can  be  extended  to  a  wider  class  of  current  vectors.  Consider  any 
current  vector  I  that  is  a  half-period  piecewise  combination  of  the  harmonics  ki, ...  ,kp. 
That  is  to  say,  over  each  half-cycle  of  the  line  voltage  (|  points  of  data),  I  can  be  written 
as  a  linear  combination  of  the  complex  exponentials  . . . ,  Then,  as  shown 

in  the  following  lemma,  there  are  many  cases  in  which  recording  the  spectral  envelopes 
S'fcj , . . . ,  S'fcp  only  at  the  points  {k^\k  E  [0,  (plus  a  negligible  amount  of  side 

information)  would  suffice  to  completely  determine  all  spectral  envelope  values.  This 
would  take  only  bits  (plus  a  negligible  amount  to  store  side  information),  which 

again  could  be  much  less  than  the  Nb  bits  needed  to  directly  store  all  raw  current  values. 

Lemma  3.  Let  ki, . . .  ,kp  E  [0,n  —  1]  be  a  collection  of  distinct  harmonics  such  that  the 
matrix  B  —  {bij)  where  bij  =  ig  invertible.  Let  I  =  (io,  •  •  ■ ,  In-i)  G 

he  any  current  vector  that  consists  of  half-period  piecewise  combinations  of  those  har¬ 
monics.  Then,  given  io,  ■  •  - ,  i^-i  and  {5fc(t)|A:  E  {ki, . . . ,  kp},  t  E  {^\m  E  [0, 
it  is  possibly  to  completely  determine  I  uniquely. 

Proof.  This  can  be  shown  by  induction.  As  the  base  case,  note  that  Zo,...,i|-i  are 
uniquely  specified.  Then,  for  each  t  E  {™|m  E  [0,  Hit, ... ,  h+^-i  are  uniquely 

specified  then  it+^,  ■  ■  ■  can  be  uniquely  determined  as  follows. 

Let  mi  =  Xljii  ^  ije'~^''L  Each  of  the  mi  are  determined,  uniquely,  by  the 
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already  known  values  of  I.  Let  ai,...,ap  G  C,  f{x)  =  Then  ij  = 

/(j  —  G  {t  —  +  n  —  1}  for  some  setting  of  Oi, . . .  ,  because  I  consists  of 

half-period  piecewise  combinations  of  the  harmonics  ki, . . . ,  kp.  Then, 


t+n—l 


Ski{t)  =  X! 

j=t 


t+f-i 


t+n—l 


=  E  +  E 


IjB 


j=t 


t=t+f 


--1 

/i=0 

—  —  1 

=  mi  -f-  ^  ^ 
h=0  j=l 

—  —1 

=  m;  4-  (-1)^''  ^ 

/i=0  j=l 

—  —  1 

=  m;  +  (-1)^'  X!  X] 

1=1  ?j=0 

p 

=  mi  + 

t=i 

where  bij  =  ^  _  [oi, . . . ,  be  the  1  xP  colnmn  vector  of  the  a^s, 

let  P  =  {bij)  be  thepxp  matrix  of  the  fcjjs,  and  let  C  =  [(-l)*i5fci  -mi, . . . ,  (— l)^pS'fcp  — 
mp]^  be  a  1  X  P  column  vector.  Then,  the  above  becomes  BA  =  C  which  has  the  unique 
solution  A  —  B~^C  exactly  when  B  is  invertible.  Since  B  is  invertible  by  assumption, 
this  immediately  implies  that  it+^ it+n-i  are  uniquely  determined.  Inducting  on  t 
completes  the  proof.  □ 
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The  applicability  of  this  lemma  depends  on  the  invertibility  of  this  matrix  B. 
A  necessary  and  sufficient  condition  for  the  invertibility  of  B  is  shown  in  the  following 
lemma. 

Lemma  4.  Let  ki,...,kp  G  [0, n  —  1]  be  a  collection  of  distinct  harmonics,  and  let 
B  =  (bij)  be  the  p  x  p  matrix  with  entries  given  by  bij  =  J2h==o  Then  B  is 

invertible  if  and  only  if  p  <  f . 


Proof.  Let  yo, . .  ■ ,  Pn-i  be  n  element  vectors  where  the  hf^  element  of  vector  yj  is  given 
by 

yAh]  =  e^^\ 

Thus,  the  Pj  are  simply  the  DFT  basis  functions.  Let  xi, ...  ,Xp  denote  n  element  vectors 
which  are  given  by  the  p  elements  of  the  set  {pj}  that  correspond  to  elements  in  {kj}. 
That  is  to  say  Xj  =  pk^  ■  Let  xl, . . . ,  ^  be  n  element  vectors  that  are  defined  by 


Similarly,  let 


xAh] 


Xj[h]  ^  <  f 

0  otherwise 


vA^] 


pAh]  <  f 

0  otherwise 


Recall  that  the  Gram  matrix  of  a  set  of  n  element  vectors  {ui, . . . ,  Vp\  is  the  pxp 
matrix  given  by  G  =  (fi'iy)  where 


9i,j  =  {vi,Vj). 
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Furthermore,  recall  that  G  is  invertible  if  and  only  if  the  vectors  {ui, . . . ,  Vp}  are  linearly 
independent.  Clearly,  B  is  the  Gram  matrix  of  Thus,  B  is  invertible  if  and 

only  if  {T7,  . . .  ,x^}  is  a  linearly  independent  set. 

Next,  notice  that  yo,  •  ■  • ,  Vn-i  is  an  orthogonal  basis  of  the  space  C”.  Moreover, 
^, . . . ,  y„_i  are  the  projection  of  yo, .  ■  •  ,yn-i  onto  the  space  ((C2  0  O2)  =  C2 .  Clearly, 
^, . . . ,  Hn-i  span  the  |  dimensional  space  ((C^  ©  0^).  Thus,  any  subset  of  |  elements 
of  the  set  {y^, . . . ,  yT^}  must  be  a  basis  of  ((C^  ©  0?),  which  immediately  implies  that 
any  subset  of  {^, . . . ,  yjjTj"}  of  size  at  most  |  is  linearly  independent,  and  of  size  at  least 
I  is  not  linearly  independent. 

By  construction,  {Tf, . . .  ,^}  C  {^, . . .  ,yj^}.  Therefore,  {xj, . . .  ,^}  is  linearly 
independent  if  and  only  if  p  <  | . 

□ 

Based  on  observations  of  a  variety  of  loads,  [1],  it  appears  that  many  real  electrical 
loads  consist  (approximately)  of  half-period  piecewise  combinations  of  a  small  number  of 
harmonics.  Thus,  it  is  reasonable  to  expect  that  the  above  assumptions  might  actually 
be  valid  in  practice.  Figure  3,  below,  shows  the  raw  current  drawn  by  a  variety  of  loads 
which  have  this  property. 

This  section  concludes  by  considering  an  algorithm  P  which  takes  as  input  L  and 
any  sufficient  set  of  spectral  envelope  values  That  is  to  say,  the  set  of  spectral 

envelope  values  is  sufficient  to  uniquely  determine  the  characteristic  current  vector  Ij  for 
any  load  Ij  E  L.  P  makes  its  prediction  by  the  applying  the  same  maximum  likelihood 
rule  used  by  algorithms  A  and  E.  That  is  to  say,  it  makes  the  prediction 


I  =  argmaxi.^LPr[l  = 
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Figure  4.3:  Turn-on  Transients.  This  figure  illustrates  turn  on  transients  for  a  variety 
of  electrical  loads.  These  three  plots  show  a  turn  on  transient  for  an  incandescent  light 
bulb,  a  motor,  and  a  computer  power  supply,  respectively. 
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The  following  lemma  shows  that  such  an  algorithm  will  perform  exactly  as  well  as  the 
algorithm  discussed  earlier,  whose  input  includes  all  spectral  envelope  values. 

Lemma  5.  For  algorithms  P  and  E,  as  defined  above,  let  Ip  and  Ip  denote  the  predictions 
made  by  algorithms  P  and  E,  respectively.  Then,  Pr[Ip  =  1]  =  =  1]  always. 

Proof.  The  proof  of  them  lemma  is  analogous  to  the  proof  of  Lemma  1.  Notice  that, 
by  construction,  both  E  and  P  are  optimal  algorithms  in  the  sense  that  they  make  the 
most  likely  prediction  given  their  input.  Moreover,  notice  that  the  input  to  P  could  be 
constructed  from  the  input  to  E  (because  it  is  a  subset  of  that  input)  and,  furthermore, 
that  the  input  to  E  could  be  constructed  from  the  input  to  P  (because  the  input  to 
P  is  sufficient  to  uniquely  determine  I  and  therefore  to  uniquely  determine  all  spectral 
envelopes). 

Thus,  by  the  logic  of  the  proof  of  Lemma  1,  if  E  (resp.  P)  performed  better  than 
P  (resp.  E)  in  any  case,  this  would  contradict  the  optimality  of  P  (resp.  E)  because  it 
would  allow  some  other  algorithm  C  to  be  formed  which,  on  the  same  input  as  P  (resp. 
E)  would  predict  more  accurately  than  E  (resp.  P)  by  using  P  (resp.  E)  as  a  subroutine. 
This  contradiction  immediately  implies  that  Pr\lp  =  1]  =  Pr\lE  =  1]  always.  □ 


4.4  EM  Algorithm 

The  previous  section  discussed  the  theoretical  applicability  of  using  spectral  enve¬ 
lope  data  for  the  purpose  of  classification.  As  was  shown,  a  maximum-likelihood  classifier 
using  a  sufficient  set  of  spectral  data  is  exactly  as  accurate  at  a  maximum-likelihood  clas¬ 
sifier  operating  on  the  raw  current  data.  This  section  will  consider  a  practical  way  to 
perform  classification  using  the  EM  algorithm  [8] . 
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The  fundamental  goal  of  the  EM  algorithm  is  to  be  able  to  take  a  large  collection  of 
data,  where  each  piece  of  data  is  a  set  of  spectral  envelope  data  for  a  single  turn-on  event, 
and  split  that  data  into  a  collection  of  clusters,  where,  ideally,  each  cluster  will  contain 
all  of  the  turn-on  events  for  a  single  load.  To  be  precise,  the  EM  algorithm  produces  a 
maximum  likelihood  estimation  of  a  set  of  unknown  parameters  given  incomplete  data. 
In  this  setting,  the  incomplete  data  is  the  collection  of  spectral  envelope  data  obtained 
from  many  turn-on  events.  This  data  is  incomplete  because  the  identity  of  the  load 
represented  in  each  turn-on  event  is  initially  unknown  (it  is  the  goal  of  this  classification 
algorithm  to  determine  this  identity).  In  the  following,  the  “label”  of  a  piece  of  (spectral 
envelope)  data  will  refer  to  the  identity  of  the  load  that  produced  that  data.  Data  which 
has  a  label  will  be  referred  to  as  “labeled”  and  data  without  a  label  will  be  referred 
to  as  “unlabeled”.  The  parameters  that  are  being  estimated  by  the  algorithm  are  the 
parameters  that  determine  the  clusters. 

For  the  sake  of  clarity,  before  providing  a  detailed  description  of  the  operation  of 
the  EM  algorithm,  a  simple  example  will  be  considered.  In  this  example,  each  cluster 
will  be  given  by  a  multidimensional  Gaussian  in  spectral  envelope  space.  That  is  to  say, 
if  p  significant  spectral  envelopes  are  recorded,  at  each  of  h  different  points  in  time,  then 
the  space  of  interest  will  be  isomorphic  to  and  each  cluster  will  be  given  by  a  ph 
dimensional  complex  Gaussian.  Thus,  each  cluster  is  represented  by  a  probability  density 
function  where  the  value  of  this  function  at  any  point  in  (spectral  envelope)  space  signifies 
the  density  of  probability  that  the  device  represented  by  that  cluster  would  produce  the 
spectral  envelope  data  represented  by  that  point.  Each  Gaussian  cluster  will  be  assumed 
to  be  spherically  symmetric;  that  is  to  say,  its  covariance  matrix  is  given  by  cr^Iph,  where 
Iph  is  the  ph  X  ph  identity  matrix  and  is  a  variance.  Furthermore,  in  this  example, 
all  clusters  will  be  assumed  to  have  the  same  covariance  matrix.  Thus,  the  unknown 
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parameters  that  specify  the  M  clusters  (one  cluster  for  each  of  the  M  electrical  loads) 
are  the  means  of  each  cluster  /^i, . . . ,  /tm  and  the  single  common  variance  cr^.  It  should  be 
noted  that  these  parameters  really  do  suffice  to  completely  specify  the  clusters  because  a 
Gaussian  is  determined  by  its  mean  and  covariance  matrix.  In  addition  to  the  unknown 
parameters  that  specify  the  clusters,  the  apriori  probability  distribution  of  load  selection 
is  also  unknown.  As  discussed  in  Section  1,  pi{lj)  is  the  probability  that  load  Ij  is  the  load 
in  the  “black-box.”  This  distribution  is  determined  by  M  —  1  parameters  (there  are  M 
loads,  and  the  sum  of  the  probabilities  over  all  loads  must  be  1).  The  EM  algorithm  will 
attempt  to  estimate  the  2M  unknown  parameters  (M  + 1  that  specify  the  clusters,  M  —  1 
that  specify  the  apriori  distribution)  from  the  data  by  finding  the  maximum  likelihood 
setting  of  these  parameters. 

Before  proceeding  further,  it  is  worth  noting  that,  while  many  specific  assumptions 
were  made  in  the  above  example  about  the  structure  of  the  clusters,  these  assumptions 
are  actually  quite  realistic,  in  certain  settings.  For  example,  consider  raw  current  mea¬ 
surements  that  are  corrupted  with  additive  white  Gaussian  noise  (AWGN).  It  can  easily 
be  shown  that  every  spectral  envelope  will  also  be  corrupted  by  Gaussian  noise,  and, 
moreover,  that  the  variance  of  the  noise  in  each  spectral  envelope  will  be  equal  (because 
the  noise  is  white).  Thus,  in  the  AWGN  case,  the  assumption  that  the  clusters  will 
be  spherically  symmetric  Gaussians  with  identical  covariance  matrices  would  be  exactly 
correct. 

To  describe  the  EM  algorithm  precisely,  a  bit  of  notation  needs  to  be  introduced. 
Let  0  be  a  vector  which  specifies  the  parameters  of  the  clusters  (in  the  above  example, 
9  consists  of  the  means  and  common  variance  of  the  clusters  as  well  as  the  probability 
distribution  that  each  cluster  ).  Let  x  be  the  vector  of  unlabeled  data,  which  consists 
of  spectral  envelope  values  at  each  turn-on  event.  To  be  precise,  if  the  data  set  includes 
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data  from  r  turn-on  events,  then  a:  is  an  r  element  vector  where  the  element  of  this 
vector  is  the  collection  of  spectral  envelope  data  during  the  turn-on  event.  Thus,  each 
element  of  x  isn’t  a  single  number,  but  rather  an  entire  collection  of  data.  The  vector  x 
is  a  single  sample  value  from  the  random  vector  X.  Let  Z  denote  the  random  vector  of 
labels  (each  element  is  the  identity  of  the  load  that  produced  a  particular  collection  of 
spectral  envelopes,  stored  in  part  of  x,  when  turned  on)  and  z  denote  a  particular  setting 
of  labels  {z  is  a  particular  sample  value  from  Z).  Let  L{9,x,z)  denote  the  likelihood 
function,  which  specifies  the  likelihood  that  clusters  with  parameters  d  would  correspond 
to  data  X  and  z. 

The  algorithm  operates  is  a  series  of  phases,  where  each  phase  produces  a  new 
(and  hopefully  better)  estimate  of  9]  let  9^^^  denote  the  estimate  of  9  produces  in  round 
t.  Each  phase  consists  of  two  steps:  the  expectation  step  (El-step)  and  the  maximization 
step  (M-step). 

In  the  E-step,  the  expected  value  of  the  log  of  likelihood  function  is  calculated 
(where  expectation  is  performed  with  respect  to  the  conditional  distribution  of  Z  given 
9  and  x).  This  expected  value,  in  step  t,  is  denoted  by  Q{9\9^^^)  and  is  given  by 


(4.3) 


In  the  M-step,  a  new  estimate  of  parameters,  is  produced  by  selecting  the 

parameter  9  which  maximizes  the  quantity  calculated  in  the  E-step.  Thus, 


^(t+i)  _  argmax0Q{9\9^*^).  (4.4) 

To  apply  the  EM  algorithm  to  classifying  loads  on  the  basis  of  spectral  envelope 
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data,  the  assumptions  of  the  above  example  will  be  made,  with  the  exception  of  the  fact 
that  no  assumptions  will  be  made  about  the  covariance  matrices  of  each  Gaussian  cluster. 
That  is  to  say,  different  clusters  may  have  different  covariance  matrices  and  each  cluster 
may  or  may  not  be  spherically  symmetric.  Thus,  6  is  the  3M  element  parameter  vector 
given  by 

0  —  (/^i  ^  •  •  •  j  ■  7  )  Pi  7  •  ■  •  j  Pm  ) 


where  p-i, ...  jPm  are  the  means  of  the  M  clusters,  Ei, . . . ,  Sju  are  the  covariance  ma¬ 
trices  of  each  cluster,  and  pi,. ..  ,Pm  are  the  apriori  probabilities  that  loads  h, . . . ,  Im, 
respectively,  are  turned  on.  These  apriori  probabilities  have  the  constraint  Ylf^iPi  ~ 

The  likelihood  function,  L{9,  x,  z),  is  given  by 


R 

L{9,x,z)  =  Y[pzifzi{xi), 

i=l 


(4.5) 


where  fj{x)  denotes  the  probability  density  function  of  the  cluster,  Xi  denotes  the  zth 
piece  of  data,  and  R  denotes  the  number  of  pieces  of  data.  This  cluster  is  a  multivariate 
Gaussian  with  mean  p.j  and  covariance  Recall  that  the  probability  density  function 
f{x)  of  a  multivariate  Gaussian  with  mean  p  and  covariance  matrix  E  is  given  by 


y^{2'n')^  det(Ej) 


(4.6) 


Therefore,  by  substituting  (4.6)  into  (4.5),  and  rearranging,  the  likelihood  function  is 
given  by 

R 


L(9,x,z)  -  Yi 


Pzi 


f=  1  \/(27r)^det(E^J 

R  ln( _ ==£Sj==) 

g  ^ {27r)^  det(Sz^)  g~  2 

i=l 
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=  exp(^(ln(p^J  -  y  ln(27r)  -  ^  ln(det(S,J))  -  ^iz,)).  (4.7) 

1=1 

The  likelihood  function  is  expressed  in  exponential  family  form  for  convenience  in  later 
calculation. 

The  goal  of  the  E-step  is  to  calculate  Q{9\9^^'>)  =  E2;\x,0(t)  In  L{9,  x,  Z).  To  do  this, 
we  must  first  calculate  P{Zi  =  lj\Xi  =  Xj).  By  Bayes’  Theorem  and  the  definition  of 
conditional  probability,  this  is  given  by 


P{Zi  =  lj\Xi  =  Xi)  = 


PjXi  =  Xi\Zi  =  lj)P{Zi  =  Ij) 
P{Xi  =  Xi) 


P(Xi  =  Xi\Zi  =  lj)P{Zi  =  Ij) 

Eti  P{Xi  =  Xi\Zi  =  l^)P{Zi  =  Ij) 


Ef=i  fk{xi)pk 


(4.8) 


Using  this  equation  for  P{Zi  =  lj\Xi  —  Xi),  and  (4.3),  we  can  immediately  write 
a  closed  form  equation  for  the  calculation  performed  in  the  E-step. 


Q(0|0W)  =  ^  lnLi9,x,Z) 


R 

Ez\xM‘)  hi(exp(^(ln(p^J  -  ^  ln(27r)  -  |  ln(det(S^J))  -  hxi  -  -  fi,.))) 


i=l 


M 


^P{Zi  =  lj\Xi  =  Xi)^(ln(p^J-y  ln(27r)-^ln(det(S^J))-i(x~;u^J^EE(2;i-//^J 

j=\  i=l 

EE  \-~\R  £  7r77-(ln(p^J  -  ^  ln(27r)  -  ^  ln(det(E^J)  -  i(xi  -/r^,)^E^/(xi  -/r^J). 


7=^1  X/fc=l  fk{^i)Pk 


(4.9) 


In  the  M-step,  we  choose  a  set  of  parameters  to  maximize  the  quantity 
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Q{0\9^*^)  determined  in  the  E-step.  Notice  that  Q{9\O0^)  has  a  particularly  simple 
form.  None  of  the  parameters  Pi,  ■  ■  ■  ,Pm  share  a  term  with  any  of  the  other  param¬ 
eters  pi,. .  ■ ,  Pm,  Si)  •  •  • )  S^.  Thus,  the  values  of  pi, . . .  ,Pm  that  maximize  Q{9\d^*'>)  can 
be  determined  independently  of  pi, . . . ,  pu,  Si, . . . ,  Sm-  Let  {Px  \  ■  ■  ■  ,Pm)  denote  the 
vector  composed  of  the  probabilities  pi, , pm  produced  in  the  preceding  M-step.  Then 
{Pi  \  .  ■ .  ,Pm)>  the  estimates  produced  in  the  current  M  step,  are  given  by 


M 


fjixi)p] 


9) 


^  j=i  ELi  fkixi)p 


(t) 


Notice  that  this  has  exactly  the  same  form  as  the  maximum-likelihood  estimator  for  the 
binomial  distribution.  Thus, 


1 


Similarly,  the  pairs  of  parameters  (^i,  Si), . . . ,  (/jM)  Sm)  also  appear  in  separate 
terms  from  one  another  and  from  the  parameters  Pi, . . .  ,Pm-  Therefore,  the  values  of 
the  parameters  (pj,Sj)  that  maximize  Q(0|0d))  can  be  determined  independently  of  all 
other  parameters.  Thus, 


R 


Sj.*^^^)  =  argmaxp.^j:.  ^ 


fj{^i)Pj 


^  T.k=i  fk{xi)Pk^‘^ 


(i  \n{det{Ej))-l{xi-pjfi:-  \xi-pj)). 


Notice  that  this  has  exactly  the  same  form  as  the  maximum-likelihood  estimator  for  a 
Gaussian  distribution.  Thus 


(t+1)  _  Y^k=l  fk{^i)Pk 

^  _ fj  (x^)pj _ 


(4.10) 
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and 


„(t+l)  _  Sfe=l  fk{^i)Pk 

~  fl(^i)Pi 

'Ek=l  fk{xi)Pk 


(4.11) 
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Chapter  5 


An  FPGA-based  Spectral  Envelope 
Preprocessor 

5.1  Background 

Power  electronics  and  power  electronic  controls  are  proliferating  in  consumer  elec¬ 
tronics.  There  is  an  increasing  expectation  that  advanced  power  conditioning  electronics 
will  play  a  role  in  managing  and  coordinating  power  consumption  not  simply  for  a  par¬ 
ticular  load,  e.g.,  a  variable  speed  drive  in  an  air  conditioning  plant,  but  also  in  response 
to  the  dynamic  needs  and  capability  of  the  utility  system.  Loads  that  can  respond  not 
only  to  their  own  tasking  but  also  to  the  needs  of  the  utility  are  implicit  in  many  visions 
of  a  “smart  grid.” 

There  is  a  need  for  flexible,  inexpensive  metering  technologies  that  can  be  deployed 
in  many  different  monitoring  scenarios.  Individual  loads  may  be  expected  to  compute 
information  about  their  power  consumption.  They  may  also  be  expected  to  communicate 
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information  about  their  power  consumption  through  wired  or  wireless  means.  Switch  gear 
like  circuit  breaker  panels  may  eventually  be  expected  to  provided  detailed  submetering 
information  for  different  loads  on  different  breakers  or  clusters  of  breakers  and  controls. 
New  utility  meters  will  need  to  communicate  bidirectionally,  and  may  need  to  compute 
parameters  of  power  flow  not  commonly  assessed  by  most  current  meters. 

Appropriate  sensing  hardware  and  information  delivery  systems  remain  a  chief 
bottleneck  for  many  applications.  Both  vendors  and  consumers  will  likely  find  innumer¬ 
able  ways  to  mine  information  if  made  available  in  a  useful  form.  However,  metering 
hardware  and  access  to  metered  information  will  likely  limit  the  implementation  of  new 
electric  energy  conservation  strategies  in  the  near  future.  The  U.S.  Department  of  Energy 
has  identified  “sensing  and  measurement”  as  one  of  the  “five  fundamental  technologies” 
essential  for  driving  the  creation  of  a  “Smart  Grid”  [13].  Consumers  will  need  ’’sim¬ 
ple,  accessible...,  rich,  useful  information”  to  help  manage  their  electrical  consumption 
without  interference  in  their  lives  [13]. 

Digital  technology  has  been  in  use  for  over  20  years  for  measuring  and  meter¬ 
ing  power  flow.  A  few  examples  from  an  enormous  array  of  metering  and  measurement 
approaches  for  monitoring  power  can  be  found  in  [14],  [15],  and  the  references  in  these 
documents.  Digital  power  monitoring  has  also  made  its  way  to  the  “plug”  and  “power 
strip”  level,  e.g.,  see  [17].  Many  different  schemes  for  storing  or  communicating  informa¬ 
tion  are  still  under  exploration  see  [16]  and  its  references  for  example.  Most  of  these 
solutions  deploy  computation  hardware  that  is  either  substantially  complicated  in  both 
hardware  and  firmware,  e.g.,  [14]  where  a  DSP  and  a  micro-controller  work  together  to 
coordinate  computation  of  real,  reactive,  and  apparent  power,  or  where  fully  integrated 
custom  chips  are  specifically  developed  for  a  particular  application. 

The  “spectral  envelope”  representation  of  observed  current  and  voltage  signals 
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used  in  the  non-intrusive  load  monitor  or  NILM  [1]  can  be  a  very  flexible  basis  for 
computing  and  tracking  all  sorts  of  useful  metrics  about  power  consumption.  Spectral 
envelopes  estimate  real  and  reactive  power  consumption  and  harmonic  contents.  Also, 
even  for  waveforms  with  substantial  high  frequency  content,  the  frequency  content  of 
the  spectral  envelopes  can  be  made  relatively  band-limited.  Spectral  envelopes  are  often 
a  natural  way  to  “compress”  useful  data  about  load  current  and  power  consumption, 
easing  communication  requirements. 

This  chapter  describes  an  integer-arithmetic  implementation  of  a  spectral  enve¬ 
lope  preprocessor  for  an  inexpensive  FPGA.  The  spectral  envelope  FPGA  coordinates 
data  acquisition  and  computes  spectral  envelopes  without  the  need  for  floating  point 
computation.  Hence,  the  FPGA  can  be  used  in  a  two-IC  suite  (with  an  analog-to-digital 
converter),  to  inexpensively  acquire  load  consumption  data.  This  data  minimizes  the 
need  for  “downstream”  computation  later  in  the  signal  processing  workflow.  Of  course, 
further  computation  can  be  used  to  track,  trend,  price,  or  control  energy  consumption. 
The  FPGA  can  directly  control  communication  as  well,  providing  wired  or  wireless  ac¬ 
cess,  or  storage  on  flash  memory  or  other  media.  The  spectral  envelope  FPGA  is  a 
cost-effective  building  block  that  can  be  used  to  enable  a  huge  array  of  power  monitoring 
and  control  applications,  ranging  from  the  individual  load  up  through  the  breaker  panel 
or  utility  service  entry  level  and  beyond.  It  can  be  used  to  provide  necessary  power 
consumption  information  for  coordinating  power  electronic  controls. 

This  chapter  describes  the  design  of  this  key  building  block  and  shows  results 
from  a  prototype.  The  next  section  describes  the  approach  for  using  spectral  envelopes 
for  load  identification  and  how  this  data  is  computed  by  the  preprocessor.  The  following 
section  presents  the  FPGA-based  spectral  envelope  preprocessor  and  the  techniques  used 
to  implement  the  preprocessor  cost-elfectively.  Finally,  the  performance  of  prototype 
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hardware  is  examined  and  further  enhancements  for  expanded  monitoring  applications 
are  described. 


5.2  Utility  of  Spectral  Envelopes 

Typical  turn-on  current  transients  are  shown  in  Fig.  5.1  for  an  incandescent  lamp, 
a  universal  motor  (as  in  a  vacuum  cleaner  or  hand  tool),  and  a  personal  computer. 
Dynamic  changes  in  the  power  and  harmonic  consumption  of  a  load,  e.g.,  during  turn-on 
or  turn-off  transients,  can  serve  as  a  fingerprint  for  identifying  load  operation  [1].  For 
example,  an  observed  turn-on  transient  or  exemplar  from  a  training  observation  produced 
by  one  of  a  collection  of  loads  can  be  used  to  identify  the  load  in  an  aggregate  current 
measurement.  An  analogous  procedure  can  be  performed  using  turn-off  transients.  All 
that  is  needed,  in  principle,  to  determine  the  operating  schedule  of  a  collection  of  loads 
is  to  record  the  aggregate  current  drawn  by  those  loads  and  then  match  each  observed 
transient  to  the  turn-on  or  turn-off  fingerprint  of  a  particular  load  in  the  collection. 

Direct  examination  of  current  waveforms  may  not  be  practical  for  many  stages 
of  some  applications,  including  many  components  in  energy  scorekeeping,  monitoring,  or 
conservation  systems.  Direct  operations  on  the  current  waveform  require  sample  rates 
adequate  to  capture  the  highest  harmonic  content  of  interest  [1].  In  some  metering, 
monitoring,  and  control  applications  it  is  more  practical  to  either  store  data  for  a  period 
of  time  and  examine  it  later,  or  transmit  data  to  another  location  for  interpretation  and 
control.  In  either  of  these  cases,  it  is  convenient  to  have  a  useful  representation  of  the 
data  that  avoids  excessive  storage  or  communication  bandwidth  requirements. 

Spectral  envelopes  provide  a  useful  separation  between  data  collection  and  anal¬ 
ysis.  They  permit  a  small,  inexpensive  system  with  low  processing  power  to  collect  data 
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Figure  5.1:  Turn-on  transients  for  an  incandescent  light  bulb,  a  motor,  and  a  computer 
power  supply,  respectively. 
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continuously.  A  system  with  larger  available  processing  power,  potentially  physically  re¬ 
mote  from  the  data  collection  front  end,  can  either  review  a  storage  device  at  a  later  time 
or  continuously  process  a  relatively  low  bandwidth  information  stream  over  a  convenient 
communication  channel,  wired  or  wireless. 

The  spectral  envelopes  of  current  are  defined  at  each  line-locked  period  of  the 
service  voltage.  If  i[n]  represents  the  samples  of  current,  and  there  are  N  samples  taken 
per  cycle,  then  the  spectral  envelopes  of  current  are  given  by  ao[m], . . .  ,aAr_i[m]  and 
6o[m], . . . ,  6jv-iM,  where  m  indexes  the  period.  These  spectral  envelopes  are  defined  as 


and  similarly, 


aj[m 


N-l  „  , 

=  ^  i[mN  -f  k]  ■  sin{-^) 

k=0 


N-l 


fc=0 


N 


(5.1) 


(5.2) 


Spectral  envelopes  can  be  expressed  in  terms  of  the  Discrete  Fourier  Transform 
(DFT)  [2].  The  DFT  transforms  a  sequence  of  N  complex  numbers,  a:[0], . . .  ,x[A’  —  1], 
to  another  sequence  of  N  complex  numbers  A'[0], . . . ,  —  1]  by 


X[k]  = 

n=Q 


where  here  j,  rather  than  i  is  used  to  represent  to  avoid  confusion  with  current. 

The  inverse  of  this  transformation,  the  IDFT,  is  given  by 


x[n 


1 

N 


J]  A[A;]e^'=". 

fc=0 


Thus,  the  spectral  envelope  values  are  simply  given  by  the  real  and  imaginary  parts  of 
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the  DFT  of  current. 


Given  the  DFT  coefficients  over  one  period  of  the  service  voltage,  it  is  possible  to 
exactly  reconstruct  or  preserve  all  of  the  information  in  the  raw  current  samples  over  that 
period.  Of  course,  simply  recording  all  of  the  DFT  coefficients  will  not  reduce  the  data 
handling  requirements  -  if  there  are  N  samples  of  current,  the  DFT  will  transform  these 
N  samples  to  N  DFT  coefficients.  While  it  may  appear  that  storing  the  DFT  coefficients 
would  take  twice  as  much  space  as  storing  the  raw  current  samples,  because  they  are 
complex  numbers,  it  should  be  noted  that  the  DFT  will  be  symmetric  (because  the  raw 
current  samples  are  real),  so  only  y  of  the  N  complex  numbers  need  to  be  stored;  thus, 
the  storage  requirement  for  storing  all  meaningful  DFT  coefficients  is  the  same  as  storing 
all  raw  current  samples  (when  both  are  stored  to  the  same  level  of  precision).  However, 
as  observed  in  [10],  in  situations  where  the  significant  or  relevant  current  drawn  by  an 
electrical  load  consists  predominantly  of  the  fundamental  frequency  (the  frequency  of 
the  service  voltage,  for  example,  60  Hz)  and  a  small  collection  of  the  line  frequency 
harmonics  (such  as  the  2"^,  3^*^,  5*’^),  it  is  reasonable  to  record,  for  each  period  of  the 
service  voltage,  only  a  few  DFT  coefficients.  These  relatively  few  DFT  coefficients  can  be 
used  to  reconstruct  the  original  current  samples  with  a  relatively  small  error.  Also,  the 
time  varying  values  of  the  DFT  coefficients  themselves  can  be  used  directly  as  fingerprint 
signatures  for  the  loads,  or  to  track  important  quantities  associated  with  load  operation 
with  reasonable  accuracy. 

Because  only  a  few  DFT  coefficients  may  be  needed  to  accurately  represent  the 
current  waveforms,  this  “spectral”  approach  to  the  representation  of  the  waveforms  serves 
as  a  form  of  compression.  As  a  concrete  example,  consider  current  and  voltage  samples 
that  are  collected  at  a  7.68  KHz  sample  rate.  The  sampling  rate  must  be  sufficiently 
high  in  order  to  provide  adequate  anti-aliasing  without  filtering  effects  and  to  provide 
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adequate  detection  of  voltage  zero  crossings  to  enable  line-locked  data  collection.  This 
corresponds  to  128  samples  per  60  Hz  line-cycle  {N  =  128),  and  so  64  meaningful  complex 
DFT  coefficients  need  to  be  stored  to  perfectly  reconstruct  arbitrary  current  samples. 
However,  for  many  applications,  including  load  monitoring  for  diagnostics,  only  a  limited 
number  of  DFT  coefficients  need  be  stored.  In  the  prototype  system  discussed  here, 
just  4  coefficients,  or  6.25%  of  the  full  set  of  already  compact  harmonically-related  DFT 
coefficients,  were  needed. 

Of  course,  other  reductions  of  the  data  could  be  applied,  e.g.,  simply  recording 
average  aggregate  real  power  once  per  second  (where  the  average  is  taken  over  each  second 
interval),  leads  to  further  compression.  Such  data  would  not  reflect  the  detailed  short 
term  variations  that  would  occur  in  real  power,  nor  would  it  reflect  any  of  the  behavior 
of  the  higher  harmonics.  Time-varying  DFT  coefficients  or  spectral  envelopes  strike  a 
balance  between  the  need  to  store  or  transmit  as  little  data  as  possible  and  the  need  to 
maintain  sufficiently  detailed  data  to  be  able  to  accurately  perform  load  monitoring  of 
interest. 

Spectral  envelopes  can  be  directly  interpreted  as  other  meaningful  physical  quan¬ 
tities  under  some  conditions.  In  situations  as  illustrated  in  Fig.  5.2  where  the  utility 
voltage  is  relatively  harmonic  free  and  “stiff”  (with  constant  peak  amplitude),  the  real 
part  of  the  fundamental  frequency  spectral  envelope  or  DFT  coefficient  of  the  current 
waveform  scales  to  “real  power”  in  steady-state.  The  imaginary  part  scales  to  reactive 
power.  If  the  voltage  is  not  stiff  or  harmonically  pure,  then  it  is  still  possible  to  accurately 
estimate  real  and  reactive  power  by  also  storing  the  same  set  of  DFT  coefficients  of  the 
voltage  (i.e.,  if  the  and  7*^  DFT  coefficients  of  current  are  stored,  then  these 

same  four  coefficients  of  voltage  should  also  be  stored)  By  definition,  time  averaged  real 
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power  (over  one  line-cycle)  is  given  by 

1  N-i 

P  =  —  5]]  z[n]r;[n], 

n=0 

where  i[n]  and  v[n]  are  the  samples  of  current  and  voltage,  respectively,  over  one  period. 
Let  Ik  and  14  denote  the  (complex)  amplitudes  of  the  harmonics  of  current  and 
voltage,  respectively.  Using  the  Plancherel  Theorem,  real  power  can  be  expressed  in 
terms  of  the  DFT  of  current  and  voltage, 

N-l 

k=0 

Reactive  power  can  be  calculated  in  an  analogous  fashion.  Thus,  if  it  is  indeed  true  that 
only  a  small  number  of  DFT  coefficients  of  current  are  not  approximately  zero,  then 
an  accurate  approximation  of  real  and  reactive  power  could  be  obtained  by  storing  only 
the  few  significant  DFT  coefficients  of  current,  and  the  same  set  of  DFT  coefficients  of 
voltage. 

Figure  5.2,  shown  below,  shows  the  line  voltage,  aggregate  current,  and  prepro¬ 
cessor  output,  during  the  turn  on  of  a  device  that  draws  exclusively  real  power.  The 
only  non-zero  preprocessor  envelope  is  the  envelope  that  corresponds  to  real  power.  It  is 
important  to  note  that,  because  the  only  non-zero  DFT  coefficient  of  current  is  the  1®* 
coefficient  (fundamental),  this  envelope  is  truly  a  scaled  version  of  real  power,  regardless 
of  the  harmonic  content  of  the  voltage  waveform.  This  is  due  to  the  fact  that,  by  the 
above  logic,  the  only  harmonics  of  the  voltage  waveform  that  affect  real  and  reactive 
power  are  those  harmonics  that  are  also  present  in  current.  For  simplicity,  this  example, 
Eis  well  as  all  other  example  transients  in  the  remainder  of  this  section,  consist  of  simu¬ 
lated  data.  However,  it  should  be  noted  that,  even  though  the  analysis  in  this  section  is 
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(a)  Line  Voltage. 


(c)  Preprocessor  Output. 


Figure  5.2;  Sample  Preprocessor  Output.  The  top  two  plots  depict  the  measured  line 
voltage  and  aggregate  current,  respectively,  while  a  simulated  device  is  being  turned  on. 
The  third  plot  depicts  the  preprocessor  output. 


illustrated  with  simulated  examples,  the  analysis  applies  equally  well  to  real  data. 

Interesting  examples  arise  from  loads  that  draw  non-sinusoidal  or  harmonically 
distorted  current  waveforms,  like  some  personal  computers  or  compact  fluorescent  lamps. 
Figure  5.3  depicts  the  current  drawn  from  a  simulated  load  over  one  period  of  the  line 
voltage.  This  device  consumes  first  and  third  harmonic  current.  Calculated  preprocessor 
output  for  this  simulated  device  is  shown  in  Figure  5.4. 
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Figure  5.3:  View  of  current  over  one  period  of  the  line  voltage.  This  simulated  device 
draws  current  predominantly  at  the  first  and  third  harmonic. 
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(c)  Preprocessor  Output. 


Figure  5.4:  Sample  Preprocessor  Output.  The  first  two  plots  show  the  measured  line  volt¬ 
age  and  aggregate  current,  respectively.  The  third  pair  shows  the  preprocessor  outputs 
oi  and  03  corresponding  to  in-phase  current  draw  at  the  first  and  third  harmonics. 
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Figure  5.5:  Spectral  Envelope  Preprocessor  Block  Diagram.  An  Analog-to-Digital  con¬ 
verter  is  used  to  produce  samples  of  the  line  voltage  and  the  aggregate  current,  which 
are  then  used  by  the  FPGA  to  produce  spectral  envelope  coefficients.  These  coefficients 
can  be  stored  in  a  compact  flash  card  and  also  transmitted  via  802.11  WiFi  on  demand. 

5.3  FPGA-Based  Spectral  Envelope  Preprocessor 

To  calculate,  store,  and  communicate  a  relevant  subset  of  DFT  coefficients  for 
power  monitoring  and  energy  scorekeeping,  a  prototype  FPGA  (Field  Programmable 
Gate  Array)  was  constructed  to  implement  a  spectral  envelope  preprocessor.  All  Verilog 
code  can  be  found  in  Appendix  C.  This  system  makes  use  of  a  low-cost  FPGA  (Altera 
Cyclone  I,  EP1C3T100C8).  The  spectral  preprocessor  consists  of  four  subsystems:  a 
subsystem  that  obtains  current  and  voltage  samples,  a  subsystem  that  computes  spectral 
envelope  coefficients,  a  subsystem  that  stores  computed  spectral  envelope  coefficients,  and 
a  subsystem  that  can  transmit  the  spectral  envelope  coefficients  to  another  computation 
or  display  platform  for  further  analysis.  Each  of  these  subsystems  will  be  considered  in 
detail.  Figure  5.5  shows  the  overall  block  diagram  of  the  system. 

Data  flows  through  the  system  as  follows.  The  transducer  interface  circuitry  mea¬ 
sures  the  line  voltage  and  aggregate  current,  producing  the  signals  v{t)  and  i{t).  These 
signals  are  sampled  and  quantized  by  an  analog-to-digital  converter  (ADC)  that  pro¬ 
duces  the  samples  v[n]  and  i[n].  The  FPGA  processes  these  samples  to  compute  spectral 
envelopes.  The  spectral  envelope  coefficients  can  be  stored  in  a  Compact  Flash  (OF) 
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Figure  5.6:  FPGA  Block  Diagram.  The  preprocessor  is  used  to  calculate  spectral  envelope 
coefficients.  The  ADC  controller  is  used  to  control  the  sampling  scheme  of  the  Analog- 
to-Digital  converter.  The  CF  controller  interfaces  with  a  Compact  Flash  card  to  enable 
spectral  envelopes  to  be  stored  and  later  recalled.  The  WiFi  controller  interfaces  with 
an  802.11  WiFi  transceiver  to  transmit  spectral  envelope  data. 


card  for  later  use.  The  system  also  includes  an  802.11b/g  WiFi  transceiver  that  allows 
any  collection  of  the  spectral  envelope  coefficients  to  be  transmitted  to  another  compu¬ 
tation  device  for  analysis.  The  FPCA  provides  control  logic  for  each  of  the  subsystems. 
Figure  5.6  shows  a  block  diagram  of  the  system  implemented  in  the  FPCA. 


5.3.1  Current  and  Voltage  Measurement 

Current  and  voltage  measurements  from  at  least  one  voltage  channel  and  at  least 
one  current  channel  are  used  to  compute  spectral  envelopes.  The  system  is  easily  ex¬ 
panded  to  measure  more  than  two  channels,  supporting  three-phase  electrical  services, 
for  example.  The  prototype  system  uses  an  LA-55  current  transducer  to  measure  aggre¬ 
gate  current  and  a  simple  transformer  to  measure  the  line  voltage.  A  transformer  with 
dual  secondary  coils  was  used  in  the  prototype.  This  provides  one  coil  for  measurement 
purposes,  and  a  second  coil  for  powering  the  preprocessor.  The  two  coil  arrangement  pro¬ 
vides  a  voltage  sense  with  very  little  phase  distortion,  ensuring  accurate  calculation  of  in 
phase  and  quadrature  spectral  components.  Figure  5.7  illustrates  this  utility  connection. 
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Figure  5.7:  The  FPGA  system  receives  both  line  voltage  measurement  and  low- voltage 
supply  through  a  transformer  with  dual  secondary  coils. 

5.3.2  ADC  Controller 

In  many  signal  processing  applications,  a  computationally  efficient  algorithm  like 
the  Fast  Fourier  Transform  (FFT)  computes  the  complete  spectral  analysis  of  a  sampled 
waveform.  However,  in  situations  like  power  monitoring,  where  a  relatively  small  number 
of  spectral  coefficients  may  contain  all  or  most  needed  information,  needed  spectral  coef¬ 
ficients  can  be  computed  more  efficiently  by  a  traditional  DFT  implementation,  i.e.,  by 
mixing  observed  waveform  samples  directly  with  the  stored  samples  of  basis  sinusoids.  In 
this  approach,  basis  sinusoids  are  stored  in  a  memory  and  multiplied  by  observed  samples 
of  a  waveform.  If  there  are  N  samples  stored  in  memory  for  each  basis  sinusoid,  then 
it  is  necessary  to  acquire  N  samples  of  the  current  and  voltage  waveforms  for  each  line 
voltage  period. 

The  FPGA  coordinates  the  operation  of  the  ADC  (Analog  to  Digital  Converter) 
to  obtain  the  samples  i[n]  and  v[n]  of  the  current  and  voltage  waveforms  i(t)  and  v{t).  To 
provide  a  known  number  of  waveform  samples  per  line  period,  the  FPGA  “locks”  to  the 
line  voltage  waveform.  That  is,  the  FPGA  varies  the  sample  rate  to  track  with  variations 
in  the  line  voltage  frequency.  The  sampling  clock  is  derived  from  the  output  of  a  digital 
phase-locked  loop  (PLL)  on  the  FPGA  that  tracks  the  line  voltage  frequency. 

The  phase  of  the  sampling  is  set  such  that  the  first  voltage  and  first  current 
samples  are  taken  at  the  negative  to  positive  zero  crossing  of  the  line  voltage.  The  goal  is 
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of  the  line  voltage  and  aggregate  current.  Samples  are  taken  at  points  in  time  that 
correspond  to  the  samples  stored  in  the  basis  sinusoid  memory.  For  clarity,  this  figure 
only  depicts  8  sample  points  per  cycle,  while  the  prototype  system  actually  uses  128 
sample  points  per  cycle. 

to  multiply  each  entry  in  the  basis  sinnsoid  by  the  value  of  the  waveform  to  be  analyzed 
at  the  corresponding  point  in  time.  It  is  essential  that  the  entire  process  is  line  locked  to 
the  line  frequency  in  order  for  the  estimated  spectral  envelope  coefficients  of  current  to 
correspond  to  “in-phase”  and  “quadrature”  components  of  current  with  respect  to  the 
fundamental  of  the  voltage  waveform.  This  sampling  scheme  is  illnstrated  in  Figure  5.8. 
The  sample  times  and  values  are  indicated  by  diamonds. 


5.3.3  Envelope  Preprocessor 

Many  interesting  hardware  and  software  systems  that  can  calcnlate  spectral  en¬ 
velope  or  quantities  related  to  spectral  envelopes  have  previously  been  constructed.  Ref¬ 
erences  [14]  [16]  and  their  associated  references  describe  various  metering  schemes  that 

compute  real,  reactive,  and  apparent  power,  and  also  harmonic  distortion  in  one  form 
or  another  in  many  cases.  In  [1],  mnltiple  phase-locked  loops,  analog  multipliers  and 
integrators  were  used  to  estimate  spectral  envelope  coefficients.  In  [10],  a  design  using 
multiplying  digital-to-analog  converters,  low-pass  filters,  and  a  single  phase-locked  loop 
was  presented.  In  [11],  a  expensive  digital  signal  processing  board  was  used  to  perform 
the  calculations.  In  [12],  the  processing  power  of  a  personal  computer  was  used  for 
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Figure  5.9:  Signal  Flow  Graph.  This  diagram  depicts  the  signal  path  for  the  calculation 
of  a  spectral  envelope.  Raw  current  values  are  multiplied  by  the  appropriate  elements 
of  each  basis  sinusoid,  and  the  results  are  accumulated  over  each  period  to  produce  each 
spectral  envelope  value. 

spectral  envelope  coefficient  estimation. 

All  of  these  systems  can  provide  accurate  estimates  of  spectral  envelope  coefficients 
or  related  quantities.  They  serve  as  essential  building  blocks  of  various  types  of  metering 
systems.  They  are  often  expensive  and  dedicated.  The  FPGA-based  system  discussed 
in  this  section  is  an  inexpensive  single-chip  solution  that  can  estimate  spectral  envelope 
coefficients  for  stand-alone  use  or  as  part  of  a  turn-key  building  block  in  more  complex 
systems.  The  FPGA  computes  spectral  envelopes  using  integer  arithmetic  on  stored  basis 
waveforms  and  observed  waveform  samples. 

The  FPGA-based  spectral  envelope  preprocessor  calculates  the  spectral  envelopes 
of  cnrrent,  a,i[m],  bi[m], . . .,  where  m  indexes  the  periods  of  the  line  voltage.  Figure  5.9 
shows  the  computation  performed  to  produce  estimates  of  a  single  spectral  envelope 
coefficient,  aj[m].  The  system  multiples  i[n],  the  samples  of  current,  with  Vaj[n],  the 
samples  of  a  basis  sinusoid,  and  sums  the  result  over  a  single  period  of  the  line  voltage. 
If  N  denotes  the  number  of  samples  per  period,  then 

N-l 

aj[m]  =  ^  i[mN  +  k]  ■  Va,j[mN  -f  k]  (5.3) 

fc=0 


and  similarly. 


N-l 

^A'^]  —  i[mN  +  k]  ■  -I-  k] 

fc=0 


(5.4) 
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Each  spectral  envelope  coefficient  has  a  different  basis  sinusoid  associated  with  it;  for 
example,  calculation  of  a3[w]  involves  multiplying  i[n]  by  Da.aW,  where  Va,3[n]  consists 
of  discrete  time  samples  of  a  sinusoid  at  three  times  the  line  frequency,  with  its  phase 
locked  to  the  line  voltage.  Figure  5.10,  shown  below,  depicts  examples  of  basis  sinusoids. 
For  illustration  purposes,  these  sinusoids  are  sampled  at  only  3  bits,  while  the  prototype 
system  makes  use  of  10  bit  samples. 

Figure  5.11  shows  a  block  diagram  of  the  FPGA-based  spectral  envelope  prepro¬ 
cessor.  The  preprocessor  takes  the  discrete  time  samples  of  i{t)  and  v{t)  as  input,  denoted 
i[n]  and  v[n]  respectively,  where  n  indexes  the  samples,  and  produces  estimates  of  the 
spectral  envelope  coefficients  Oj  and  bj  of  the  current  i[n]. 

The  voltage  samples  v[n]  are  used  as  input  to  a  phase-locked  loop  (PLL),  which 
synchronizes  the  entire  computation  to  the  line  voltage.  As  noted  earlier,  the  computa¬ 
tion  process  is  synchronized  to  the  line  voltage  so  that  the  calculated  spectral  envelope 
coefficients  correspond  to  some  extent  to  meaningful  physical  quantities  (real  power,  re¬ 
active  power,  etc.).  The  output  of  the  PLL  is  sent  to  a  block  of  steering  logic  on  the 
FPGA  that  produces  the  address  for  the  basis  sinusoid  memory,  as  well  as  a  clear  signal 
for  the  accumulators.  The  basis  sinusoid  memory  consists  of  the  samples  of  the  various 
basis  sinusoids.  The  address  produced  by  the  steering  logic  specifies  a  single  sample 
time  of  a  single  basis  sinusoid.  The  sample  of  the  basis  sinusoid  that  is  retrieved  from 
the  basis  sinusoid  memory  is  then  multiplied  by  the  current  sample  i[n].  The  result  of 
this  multiplication  is  then  passed  through  a  demultiplexer  which  sends  the  result  to  the 
appropriate  accumulator  by  using  the  address  produced  by  steering  logic  to  determine 
which  spectral  envelope  coefficient  is  currently  being  calculated. 

There  is  one  accumulator  for  each  estimated  spectral  envelope  coefficient.  The 
accumulators  are  all  cleared  at  the  end  of  each  period  of  the  line  voltage  through  the 
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(a)  Basis  sinusoid  to  compute  ai 


(b)  Basis  sinusoid  to  compute  6i 


Figure  5.10:  Basis  Sinusoids.  This  figure  depicts  three  examples  of  basis  sinusoids,  used 
to  calculate  real  spectral  envelopes  of  in-phase  fundamental  frequency  content,  quadra¬ 
ture  fundamental  frequency  contents,  and  in-phase  third  harmonic  content,  from  top  to 
bottom,  respectively.  The  basis  sinusoids  shown  here  are  sampled  at  3  bits  for  illustration 
-  the  actual  prototype  spectral  envelope  preprocessor  uses  10  bit  samples. 
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Figure  5.11;  Preprocessor  Block  Diagram. 

use  of  the  clear  signal  produced  by  the  steering  logic.  For  every  sample  i[n],  the  address 
produced  by  the  steering  logic  will  select  each  of  the  basis  sinusoids  in  turn,  so  that  every 
sample  of  current  is  multiplied  by  the  appropriate  sample  of  each  of  the  basis  sinusoids. 

This  FPGA-based  implementation  provides  a  great  deal  of  flexibility.  For  example, 
the  subset  of  spectral  envelope  coefficients  that  are  being  estimated  can  be  changed  by 
altering  the  entries  in  the  memory  to  correspond  to  a  different  set  of  basis  sinusoids. 
This  implementation  is  also  efficient  in  terms  of  FPGA  resource  utilization.  It  uses  only 
a  single  PLL  and  a  single  multiplier,  as  opposed  to  previous  hardware  implementations 
that  often  used  multiple  PLLs  and/or  multipliers  [l],[10].  This  system  can  function  with 
only  a  single  multiplier  because  the  FPGA  is  capable  of  multiplying  each  sample  i[n] 
by  the  corresponding  sample  of  each  of  the  basis  sinusoids  and  forwarding  each  result 
to  the  appropriate  accumulator,  before  the  next  sample  i[n  +  1]  arrives.  The  multiplier 
consumes  substantial  logic  elements  on  the  FPGA.  It  consumes  24%  of  all  resources  used 
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by  the  envelope  preprocessor  and  13%  of  all  resources  used  by  the  complete  system.  By 
using  only  one  multiplier,  the  design  is  capable  of  fitting  in  a  small,  low-cost  FPGA. 

There  are  several  ways  to  configure  and  deploy  the  spectral  envelope  preprocessor 
for  any  given  application.  For  situations  where  the  voltage  waveform  is  relatively  sinu¬ 
soidal  and  “stiff,”  the  spectral  envelopes  of  current  can  be  interpreted  as  scaled  physical 
quantities  in  steady  state.  As  discussed  above,  under  these  assumptions,  the  Oi  envelope 
of  current  in  steady  state  corresponds  to  a  scaled  estimate  of  real  power  or  “P” .  The  bi 
envelope  of  current  corresponds  to  reactive  power  or  “Q” .  In  situations  where  the  voltage 
is  not  stiff  and/or  not  sinusoidal,  the  FPGA  could  be  tasked  to  also  compute  the  spectral 
envelopes  of  voltage  as  well  as  current.  This  more  complete  set  of  spectral  envelopes 
could  be  stored  or  transmitted  to  a  computation  platform  or  metering  instrument  that 
can  quickly  compute  estimates  of  real  or  reactive  power  or  other  quantities  of  interest. 
Alternatively,  the  FPGA  can  be  reconfigured  to  compute  quantities  like  real  and  reactive 
power.  In  practice,  it  has  been  found  that  the  basic  computation  of  the  spectral  envelopes 
of  current,  assuming  a  stiff  voltage  source,  to  yield  information  that  is  directly  useful  for 
energy  scorekeeping  and  demand-side  load  control  and  diagnostics,  e.g.,  see  [1]. 

5.3.4  CF  Controller 

The  purpose  of  this  FPGA  subsystem  is  to  store  spectral  envelope  data  on  an 
erasable  memory  like  a  CF  (Compact  Flash)  storage  card.  This  subsystem  is  capable  of 
storing  spectral  envelope  data  as  it  is  produced,  as  well  as  retrieving  the  spectral  envelope 
data  from  any  point  in  time,  on  demand.  To  interface  with  the  CF  card,  the  “True  IDE” 
interface  mode  is  used.  This  interface  mode  is  universally  supported  by  compact  flash 
storage  cards  and  it  allows  the  system  to  be  easily  adapted  to  interface  with  other  mass 
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storage  devices  that  use  the  IDE  interface  standard,  such  as  an  IDE  hard  drive.  While  it 
would  be  possible  to  impose  a  filesystem  on  the  CF  card  (i.e.  FAT32),  the  current  design 
treats  the  CF  card  as  a  single  large,  raw  block  of  storage,  for  simplicity.  Due  to  the 
low  data  rate  of  the  spectral  envelope  coefficients  (for  the  prototype  preprocessor  with  8 
spectral  envelopes,  each  stored  at  24  bits  of  resolution,  the  data  rate  is  2.8  KB/s),  even  a 
moderately  sized  CF  card  could  store  the  spectral  envelope  data  for  a  substantial  length 
of  time.  For  example,  in  for  the  prototype  system,  a  1  GB  CF  card  would  suffice  to  store 
data  for  approximately  4.3  days. 

5.3.5  WiFi  Controller 

This  subsystem  facilitates  the  transmission  of  spectral  envelope  coefficients  to  a 
PC  or  other  computation  platform  for  further  analysis  or  display.  It  makes  use  of  an 
802.11b/g  WiFi  transceiver  and  TCP/IP.  The  current  design  is  capable  of  supporting 
both  ad-hoc  and  access  point  (infrastructure)  networks. 

The  transmitted  data  is  retrieved  from  the  CF  card  as  needed.  The  WiFi  subsys¬ 
tem  can  operate  in  two  different  modes.  In  the  first  mode,  the  system  streams  spectral 
envelope  coefficients  as  they  are  generated.  In  the  prototype,  this  corresponds  to  a  data 
rate  of  2.8  KB/s.  In  the  event  of  a  momentary  interruption  in  the  connection  to  the  PC, 
the  system  will  automatically  buffer  data  from  the  last  successfully  transmitted  packet 
and  resume  transmission  from  that  point  when  a  connection  is  reestablished.  The  system 
will  then  send  data  at  the  highest  available  transmit  speed  (54  Mb/s  for  802.  llg),  until 
the  system  catches  up  to  the  freshly  produced  spectral  envelope  data.  In  the  second 
mode,  an  application  on  the  PC  requests  data  by  specifying  a  range  of  time;  the  sys¬ 
tem  then  transmits  all  data  from  the  desired  range  of  time,  at  the  maximum  possible 
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transmission  speed. 


5.4  Flexibility 

The  design  presented  above  is  just  one  example  of  an  FPGA-based  load  monitoring 
interface.  The  modularity  of  the  design,  and  the  versatility  of  FPGAs,  makes  it  simple  to 
change  the  transmission  system,  for  example,  to  wired  ethernet  (IEEE  802.3)  or  Bluetooth 
(IEEE  802.15.1)  or  ZigBee,  or  to  change  the  storage  system  to,  for  example,  a  microSD 
card.  An  FPGA  permits  the  interconnection  of  a  wide  variety  of  different  subsystems  to 
form  a  complex  utility  monitoring  system.  Even  a  small,  low-cost  FPGA  is  capable  of 
implementing  both  the  spectral  envelope  preprocessor  as  well  as  the  required  interface 
logic  to  control  the  various  subsystems.  Thus,  an  FPGA  can  serve  as  the  backbone  of 
an  inexpensive,  complete  utility  or  load  monitoring  system. 


5.5  Prototype  Results 

The  FPGA-based  system  discussed  above  calculates,  stores,  and  transmits  spec¬ 
tral  envelope  data.  Figure  5.12,  shown  below,  shows  a  picture  of  the  prototype  hardware. 

To  make  use  of  this  data,  a  monitoring  or  control  system  typically  includes  a 
subsystem  to  receive  and  use  the  spectral  envelope  data.  For  example,  a  homeowner  could 
use  a  personal  computer  to  collect  spectral  envelope  data  from  the  FPGA  preprocessor 
installed  near  or  in  a  circuit  breaker  panel.  The  prototype  includes  a  PC-based  software 
application  that  can  interface  with  the  FPGA-based  preprocessor  via  wired  or  wireless 
communication  channels,  retrieve  spectral  envelopes,  and  display  spectral  envelope  data. 
Using  this  spectral  envelope  data,  the  PC-side  application  can  disaggregate  the  operating 
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Figure  5.12:  Prototype  hardware.  This  is  a  picture  of  the  prototype  FPGA-based  system. 

schedule  of  individual  loads  from  measurements  made  on  an  aggregate  power  feed  serving 
multiple  loads.  The  application  is  self-training  and  identifies  loads  in  essentially  real-time. 
Screenshots  of  this  program  are  shown  in  Figure  5.13  and  Figure  5.14. 

This  software  communicates  via  TCP/IP  with  the  FPGA-based  preprocessor.  The 
software  can  retrieve  any  subset  of  recorded  data,  as  well  as  issue  commands,  such  as 
changing  the  sampling  resolution  of  the  ADC.  Once  spectral  envelope  data  is  retrieved, 
this  software  makes  use  of  the  Expectation-Maximization  (EM)  algorithm  [8]  to  classify 
or  recognize  the  operation  of  individual  loads. 

This  software  is  only  one  example  of  the  many  possible  ways  to  use  spectral 
envelope  data.  Other  software  applications  that  make  use  of  spectral  envelope  data  could 
be  developed  (a  system  that  uses  this  data  to  control  a  set  of  generators  in  a  micro-grid 
is  currently  in  development).  Data  could  be  retrieved  by  a  web  application  that  displays 
a  live  stream  of  data  on  a  webpage.  Other  embedded  systems  could  communicate  with 
the  FPGA-based  system  and  use  the  retrieved  spectral  envelope  data  to  control  electrical 
loads. 
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Figure  5.13:  Screenshot.  This  figure  shows  a  screenshot  of  the  prototype  non-intrusive 
monitoring  software  in  operation.  There  are  two  plots  in  the  figure  that  display  spectral 
envelope  data.  The  upper  plot  displays  real  power  and  the  lower  plot  displays  third 
harmonic  content.  The  spectral  envelope  data  corresponds  to  the  light  bulb  and  motor 
whose  raw  current  values  are  shown  in  Figure  5.1.  The  lower  left  section  of  the  screenshot 
shows  the  output  of  the  classifier,  which  has  correctly  identified  both  of  the  loads. 


Figure  5.14:  Screenshot.  This  figure  shows  a  screenshot  of  the  prototype  non-intrusive 
monitoring  software  in  operation.  As  in  Figure  5.13,  the  upper  plot  displays  real  power 
and  the  lower  plot  displays  third  harmonic  content.  The  data  corresponds  to  the  same 
light  bulb  and  motor  as  Figure  5.13,  with  the  difference  being  that  now  the  motor  is 
turned  on  while  the  light  bulb  is  on.  As  shown  in  the  lower  left  corner  of  the  screen,  the 
system  still  classifies  both  devices  correctly. 
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5.6  Applications 


The  FPGA  preprocessor  can  provide  a  turn-key  component  for  creating  all  sorts 
of  utility  monitoring,  energy  score-keeping,  and  diagnostic  applications  for  all  sorts  of 
systems.  The  preprocessor  is  relatively  simple  compared  to  microprocessor  or  DSP- 
based  data  acquisition  systems.  The  concepts  and  hardware  illustrated  here  could  be 
incorporated  into  individual  loads,  circuit  breakers,  or  circuit  breaker  panels  to  provide 
energy  consumption  information  for  both  monitoring  and  control. 

The  simplification  in  data  storage  and  transmission  bandwidth  requirements  af¬ 
forded  by  the  FPGA  can  be  extended  to  other  domains  and  monitoring  problems.  For 
example,  it  is  possible  to  extend  the  non-intrusive  monitoring  concept  beyond  the  realm 
of  electrical  distribution.  A  single  acoustic  sensor  could  be  used  to  monitor  the  flow  of 
water,  for  example,  in  a  main  water  service  to  a  set  of  rooms  in  a  building.  Finger¬ 
print  acoustic  signatures  can  be  developed  that  permit  recognition  of  hydraulic  loads  or 
events  in  the  water  distribution  system.  This  acoustic  data  is  not  “line  locked”  to  any 
particular  “utility  frequency.”  However,  much  like  a  voice  signal,  acoustic  data  can  be 
described  by  simplifying  expressions,  e.g.,  the  coefficients  of  a  time  series,  also  known  as 
linear  predictor  coefficients  (LPGs)  [18].  The  FPGA  could  be  tasked  to  compute  these 
LPCs  and  transmit  them,  again  providing  a  significant  bandwidth  reduction  for  storage 
or  transmission.  Other  applications  may  also  be  possible. 

The  approach  demonstrated  in  this  chapter  permits  a  flexible  trade-off  between 
the  hardware  installed  proximal  to  a  monitored  device  or  collection  of  devices  and  the 
transmission  bandwidth  to  and  remote  computation  capability  at  a  distal  monitoring 
or  information  gathering  system.  An  FPGA  like  the  one  described  here  could  serve 
as  a  central  coordinator  for  gathering,  processing,  and  transmitting  all  sorts  of  utility 
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information,  including  simultaneous  monitoring  of  electrical,  water,  and  gas  services. 
This  type  of  monitoring  can  support  home  or  building  level  energy  conservation  and 
diagnostics  efforts.  It  might  also  be  useful  for  coordinating  the  operation  of  generation 
and  the  scheduling  of  demand  on  micro-grid  power  distribution  systems  or  similar  power 
distribution  systems  on  transportation  systems. 
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Chapter  6 


Conclusion 


The  techniques  illustrated  in  this  thesis  can  be  applied  to  a  wide  range  of  signal 
processing  problems,  including  non-intrusive  load  monitoring.  The  FPGA-based  spec¬ 
tral  envelope  preprocessor,  discussed  in  Chapter  5,  provides  an  inexpensive,  accurate, 
and  convenient  platform  for  collected  a  variety  of  useful  data  about  a  collection  of  elec¬ 
trical  loads.  This  information  can  be  used  for  a  variety  of  power  monitoring  and  energy 
scorekeeping  tasks,  as  well  as  to  diagnose  problems  with  individual  electrical  loads.  The 
algorithms  presented  in  Chapters  2-4  can  be  applied  to  enhance  the  capabilities  of  a 
NILM  system. 

Chapter  4  presented  an  algorithm  that  could  identity  a  single  electrical  load  from 
a  collection  of  electrical  loads,  by  examining  a  subset  of  the  spectral  envelopes  of  the 
current  drawn  by  the  unknown  load.  This  allows  the  data  collected  by  the  FPCA-based 
system  of  Chapter  5  to  be  used  to  non-intrusively  monitor  a  collection  of  electrical  loads 
and  determine  when  each  load  is  turned  on  and  off,  as  well  as  how  much  power  each  load 
consumed  at  any  point  in  time. 
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Chapter  3  considered  the  problem  of  using  knowledge  of  one  subset  of  spectral 
envelope  values  to  estimate  another  subset  of  spectral  envelope  values,  for  an  appropri¬ 
ately  constrained  class  of  waveforms.  A  simple  but  numerically  unstable  algorithm  to 
solve  this  problem  was  first  presented,  followed  by  a  refined  approach  that  avoided  nu¬ 
merical  instability  to  exploiting  properties  of  cyclotomic  fields.  In  a  NILM  environment, 
many  simultaneously  operating  loads  may  draw  currents  that  have  partially  overlapping 
harmonic  content.  The  algorithm  presented  in  this  section  would  allow  the  estimation  of 
all  spectral  envelopes  of  each  individual  load  by  using  the  band  of  harmonic  content  that 
is  unique  to  that  load  to  estimate  the  overlapping  portions  of  harmonic  content. 

Chapter  2  examined  the  problem  of  calculating  the  DFT  of  a  quantized  signal. 
An  algorithm  was  presented  that  used  the  structure  of  the  mapping  between  regions  of 
frequency  space  and  quantized  current  to  accurately,  and  efficiently,  estimate  the  true 
spectral  envelope  values  of  a  measured  current.  This  algorithm  is  invaluable  when  dealing 
with  data  produced  by  the  FPGA-based  NILM  of  chapter  5  as  it  allows  accurate  estimates 
of  true  power  consumption  to  be  made  from  the  quantized  data  collected  by  that  system. 

The  algorithms  of  Chapters  2-4  can  be  applied  to  a  variety  of  discrete-time  signal 
processing  tasks  that  involve  the  computation  of  the  DFT  of  a  signal.  The  algorithm 
of  chapter  2  can  be  applied  to  any  situation  in  which  one  desires  accurate  computation 
of  the  DFT  of  a  signal,  but  is  only  provided  with  a  coarsely  quantized  version  of  that 
signal.  The  algorithms  in  chapter  3  can  be  applied  to  a  variety  of  estimation  problems 
where  the  constraints  can  be  written  with  coefficients  in  a  cyclotomic  field.  Finally,  the 
classification  algorithms  presented  in  chapter  4  could  be  applied  in  other  classification 
contexts  when  the  objects  being  classified  have  distinct  harmonic  signatures. 
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Appendix  A 


Matlab  Code  for  DFT  Accuracy 
Improvement 


This  appendix  consists  of  Matlab  code  that  implements  the  algorithm  discussed  in 
Chapter  2.  There  are  3  functions:  findFirst Vertex  which  finds  a  single  vertex  of  a  region 
R,  findNeighbors  which  finds  all  neighbors  of  vertex  x  of  region  R  and  findAllVertices 
which  finds  all  vertices  of  region  R. 

function  [x]=findFirstVertex(y,A,B) 
err  =  10''(— 4) ; 
x=y ; 

[  numConst ,  numHarms]  =  s  i  z  e  (A)  ; 

S=zeros  (0  , numHarms)  ; 

D=zeros (0,1) ! 

C=zeros (0,1); 
s=rand(l  , numHarms)  —.5; 
for  i=l:numHarms 
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dists=zeros(l  ,numConst)  ; 
for  j  =  1 :  numConst 

if  (isempty(find  (G=j  ,1)  )) 
init=sum(A(  j  ,  : )  .  *x)  ; 
adj=sum(A(j  ,  : )  .  s  )  ; 
dists  ( j  )=(B(  j  )-init  )/adj  ; 
if  (dists  ( j  )<=err  ) 
dists  ( j  )=:inf  ; 

end 

else 

dists  ( j  )=inf  ; 

end 

end 

[dummy,  newConstraint]=min (  dists  )  ; 
x=x+dists  (newConstraint)*s; 
i  f  ( i  <numHarms ) 

C=vertcat  (C,  newConstraint )  ; 

S=vertcat  (S  ,A( newConstraint  ,  : )  )  ; 

D=vertcat  (D,  [B(  newConstraint )] )  ; 

[ currRows  , dummy]  =  size  (S)  ; 

s=(  vertcat  (S  ,  rand  (numHarms— currRows  ,numHarms) )  \ 
vertcat  (D,  zeros  (numHarms— currRows  ,  1 )  ) )  x  ; 

end 
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function  [L]  =  findNeighbors(y,A,B) 

err  =10'' ( —  6)  ; 

[  numCons  ,  numHarms]  =  s  i  z  e  (A)  ; 

L=zeros  (0  , numHarms)  ; 
satCons=L ; 
unsatCons=L ; 
satD=zeros  (0,1)  ; 
unsatD=zeros  (0 ,1)  ; 
isUpper=zeros  (0,1); 
for  i  =  1 :  numCons 

i  f  (  abs  (sum(A(  i  , ; )  .  )— B(  i )  )<err  ) 

sat Cons=ver teat  (sat Cons  ,A(i  ,:)  )  ; 
satD=vertcat  (satD  ,B(  i ) )  ; 

isUpper=vertcat  ( isUpper  ,  ( i<=numCons/2) )  ; 


else 


unsatCons=vertcat  (unsatCons  ,A(  i  , : )  )  ; 
unsatD=vertcat  (unsatD  ,B(  i ) )  ; 


end 

dists=zeros(l  ,  length  (unsatD ) )  ; 
for  i  =1:  length  ( satD  ) 

i  f  ( i  s  i  n  f  ( satD  ( i ) )  ==0) 

for  j=i +1:  length  ( satD) 

if  ( isinf  (satD  ( j )  )==0  (  (sum(  abs  ( satCons  ( i  , : ) — 

satCons(j  ,:)))<err  abs  ( satD  ( i )— satD  ( j )  )<err 
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)  II  (sum(  abs  (  satCons  ( i  , ;  )+satCons  ( j  , : )  )  )<err 
&&  abs  ( satD  ( i  )+satD  ( j )  )<err  ) ) ) 
satD  ( j  )=inf  ; 

end 

end 

end 

end 

goodRows=find  (1—  i s i  nf  ( satD ) )  ; 
satCons=satCons  (goodRows  , : )  ; 
satD=satD  (goodRows)  ; 
isUpper=isUpper  (goodRows)  ; 
for  i=l:length  (satD) 
tempCons=satCons  ; 
tempCons  ( i  , : )  =r and  ( 1  ,  nuniHarms )  ; 
tempD=satD  ; 
tempD(  i )  =0; 

dir  =  (tempCons\tempD)  y ; 

i  f  ( (sum(  dir  .  *  sat  Cons  (i  ,:)))*(2*isUpper(i)  — 1)>0) 
dir=-dir  ; 

end 

for  j  =1:  length  ( unsatD ) 

init=sum(  unsatCons  ( j  ,:)  .*y); 
adj=sum(  unsatCons  ( j  , : )  .*  dir  )  ; 
dists  ( j )— (unsatD  (j )— init )  /adj  ; 
i  f  (  dists  ( j  )<err  ) 
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dists  ( j  )=inf  ; 


end 

end 

[newDist  ,  newConstraint]=min(  dists  )  ; 

candL=y+newDist*  dir  ; 

candBad=0; 

sL=size  (L)  ; 

for  j=l:sL(l) 

if  (sum(  abs  (L(  j  , : )  — candL )  )<err  ) 
candBad  =  l; 

end 

end 

if  (candBad==0) 

L=vertcat  (L ,  y+dists  (  newConstraint )  *  dir  ) 

end 
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function  [V]=findAllVertices(y,A,B) 
err  =10'' (  —  6)  ; 

[  numCons  ,  numHarms]  =  s  i  z  e  (A)  ; 

V=zeros  (0  , numHarms)  ; 

H=[  find  First  Vert  ex  (y  ,A,B)  ]  ; 
sH=size  (H)  ; 
while  (sH ( 1 )  >0) 

V=vertcat  (V,H)  ; 

L=zeros  (0  , numHarms)  ; 
for  i=l:sH(l) 

newL=findNeighbors  (H(  i  ,:)  ,A,B)  ; 
sNL=size  (newL)  ; 
sV=size  (V)  ; 
for  j=l:sNL(l) 

if(sV(l)==0  II  min(sum(  abs  (V— repmat  (newL(  j  ,:)  ,sV 
(1)  ,1) )  ,2)  )>err) 

L=vertcat  (L,newL(j 
sL=siz e  (L)  ; 

end 

end 

end 

H=L; 

sH=size  (H)  ; 

end 
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Appendix  B 


GP/PARI  Code  for  cross  estimation 


The  code  in  this  appendix  was  developed  in  collaboration  with  Warit  Wichakool. 
This  appendix  consists  of  GP/PARI  code  that  implements  the  algorithms  discussed  in 
chapter  3.  There  are  5  functions:  nfrref.gp  computes  the  RREF  of  a  matrix  with  entries 
that  are  elements  of  a  number  field,  ntt.gp  computes  the  number  theoretic  transform, 
fntt.gp  computes  a  “fast”  number  theoretic  transform,  rnul.gp  multiplies  polynomials 
using  the  NTT  and  div.gp  divides  polynomials  using  the  NTT. 

nfrref  (A,  OptionRowEchelonForm)= 

{ 

local  (numRows,numCols  ,  colindex  ,  rowindex  ,  rowCounter  ,  found  ,M,  temp  , 
pivotVal  ,k,  INeigh)  ; 

hPA; 

numRows=length  (A[  ,1])  ; 
numCols=length  (A)  ; 
colIndex  =  l; 
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rowCounter  =  l; 


while  ((  colIndex<=numCols)  &&  (rowCouiiter<=numRows)  , 

rowIndex=rowCounter  ; 
found=0; 

while  ( rowIndex<=aiumRows  &&  [found, 

if  (M[ rowindex  ,  collndex]==0,rowlndex++;,found  =  l;)  ; 

); 


i  f  ( found  , 

if  ( rowCounter !  =  rowIndex  , 

temp=M[  rowindex  ,]  ; 

M[ rowindex  ,]  =M[ rowCounter  ,]; 
M[  rowCounter  ,]  =  temp  ; 

); 


pivotVal=M[rowCounter  ,  colindex  ]  ; 

M[ rowCounter  ,]  =  M[rowCounter  ,]  /  pivotVal ; 

for  (k=l ,numRows , 

i f  ( (  OptionRowEchelonForm  &&  ({k  =  1  kk. 

rowCounter  !=  1)  ||  k  >  rowCounter))  ||  ( 

OptionRowEchelonForm  kk  k!=rowCounter )  , 
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lNeigh=M[k  ,  colindex  ]  ; 

M[k  ,]  =M[k,]  — M[  rowCounter  ,]  *  INeigh  ; 


); 


); 

rowCounter++; 


); 

colIndex++; 

); 

return  (M)  ; 

} 
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ntt  (n  ,  p  ,  z  ,  v)  = 

{ 

local  ( index  ,  k  ,  vmod , zmod  ,  vout )  ; 

vout  =  vector (n) ; 
vmod  =  Mod  ( v  ,  p )  ; 
zmod  =  Mod  ( z  ,  p )  ; 

for(index  =  1 , length ( vout ) , 
for  (k  =  0  ,  n  — 1, 

if  (v[k  +  l]  ,vout  [index]  +=  vmod  [k  +  l]*zmod"  lift  (Mod 
((index-l)*k,(p-l)))  ;)  ; 

); 

); 

return ( 1 i ft  ( vout ) ) ; 

} 
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fntt (n,p,z ,v, split  )= 

{ 

local  (nl,  n2  ,  m,  mhat ,  bx ,  row,  col,  zl  ,  z2  ,  modzl ,  modz2  ,  kl  , 
k2  ,  k  ,  modz  ,  vout  ,  vtemp  ,  THRESHOLD)  ; 

THRESHOLD  =  32; 

if  (split  [n]  =  0  II  n  <  THRESHOLD, 
return(ntt(n,p,z,v))  ;  , 

nl  =  split  [n][l]; 
n2  =  split  [n][2]; 
modz=Mod  ( z  ,  p )  ; 
modzl  =  Mod(z  ^  (n2)  ,p)  ; 
modz2  Mod(  z  "  (nl )  ,p)  ; 
zl  =  lift  (modzl )  ; 
z2  =  lift  (modz2 )  ; 

m  =  matrix  (n2  ,  nl  ,  row  ,  col  ,  v  [  (  row  — l)*nl  +  (col  —  1)  +  1] )  ; 
mhat  =  matrix  ( nl  ,  n2 )  ; 

for  (k  =  1  ,nl  , 

mhat[k,]  =  1  i  ft  (Mod(  fntt  (n2  ,  p  ,  z2  ,m[  ,  k]  ,  split  )  ,p) 

); 

); 
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vout  =  vector (n) ; 
vtemp  =  matrix  (nl  ,  n2 )  ; 


for(kl  =  0,  nl  — 1, 

for  (k2  =  0  ,  n2  — 1, 

mhat  [kl  +  l,k2  +  l]  =lift  (mhat  [kl  +  l,k2  +  l]* 
modz"  (kl*k2) )  ; 

); 


); 

for  (k2  =  0  ,  n2  — 1, 

vtemp[,k2  +  l]  =  1  i  f  t  (Mod(  fntt  (nl ,  p  ,  zl  ,  mhat  [  ,  k2 
+  1] ,  split )  ,p) )  ~; 

);  /*  end  for(k2  =  0,  ...)  */ 

for(kl  =  0.  nl— 1, 

for  (k2  =  0  ,  n2  — 1, 

vout[n2*kl  +  k2  +  1]  =  vtemp  [kl  +  l,k2  +  l] 

); 

); 


return  ( vout )  ; 
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} 
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mul  ( a  ,  b  ,  polyTemp  ,  n ,  ninv  ,  p  ,  z  ,  zinv  ,  sp  )  = 

{ 

local(acoeff  ,bcoeff  ,ahat  ,  bhat  ,  chat  ,  index )  ; 


acoeff  =  Vec  (  1  i  f  t  ( a) )  ; 
bcoeff  =  Vec  (  1  i  f  t  (b) )  ; 

ahat  =  fntt  (n  ,  p  ,  z  ,  concat  (  vector  (n— length  (  acoeff )),  acoeff ),  sp )  ; 
bhat  =  fntt  (n  ,  p  ,  z  ,  concat  (  vector  (n— length  (  bcoeff )),  bcoeff ),  sp )  ; 
chat  =  vector (n) ; 

for(index  =  l,n,chat  [index  ]=lift  (Mod(  ahat  [  index  ]  *  bhat  [  index  ]  ,  p) ) ) 

d  =  fntt  (n  ,  p  ,  zinv  ,  chat  ,  sp  )  ; 
c  =  1  i  f  t  (Mod(  ninv=)=d  ,  p) )  ; 
c  =  vecextract (c 1 . .  —  2” ) ; 

c  =  lift  (Mod(Vec(  lift  (Mod(Pol(c)  , polyTemp) ))  ,p))  ; 
c  =  apply  (x->if  (x  >  (p-l)/2,  x-p,x),c); 

return  (Mod (  Pol  (  c)  ,  polyTemp) )  ; 

} 
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div  (a  ,  b  ,  polyTemp  ,ii ,  ninv  ,  p  ,  z  ,  zinv  ,  sp  )= 

{ 

local  (  acoeff  ,  bcoeff  ,  ahat  ,  bhat  ,  chat  ,  index  ,  c,  d)  ; 


acoeff  =  Vec(  lift  (a) )  ; 
bcoeff  =  Vec(  lift  (b) )  ; 

ahat  =  fnt t  (n  ,  p  ,  z  ,  concat  (  vector  (n— length  (  acoeff )),  acoeff ),  sp )  ; 
bhat  =  fntt  (n  ,  p  ,  z  ,  concat  (  vector  (n— length  (  bcoeff )),  bcoeff ),  sp )  ; 
chat  =  vector (n) ; 

for(index  =  l,n,chat  [index]=lift  (Mod(  ahat  [  index  ]  /  bhat  [index  ]  ,  p) ) ) 

d=  f  ntt(n,p, zinv, chat, sp); 
c  =  1  i  f  t  (Mod(  ninv*d  ,  p) )  ; 
c  =  concat ( c  ,  [ 0 ] )  ; 

c  =  1  i  f  t  (Mod(  Vec  (  1  i  f  t  (Mod(  Pol  (  c)  ,polyTemp) ) )  ,p) )  ; 
c  =  apply  (x->  if  (x  >  (p-l)/2,  x-p,x),c); 

return  (Mod(Pol(c)  ,polyTemp) )  ; 

} 
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Appendix  C 


Verilog  Code  for  FPGA-Based 
Spectral  Envelope  Preprocessor 


This  chapter  includes  all  Verilog  code  used  to  implement  the  FPGA-Based  Spec¬ 
tral  Envelope  Preprocessor  discussed  in  Chapter  5. 

Illllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll 

//Top  level  module  of  PowerMon  // 

//  // 

//Version  1.6 


// 

/ /  Author  :  Zack  Remscrim 

// 

lllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll 


120 


module  PowerMon( 


// clock 
OSC_25 , 

//led 

OLED, 

//compact  flash 

CF.CSO,  CF.CSl,  CF_A,  CFJ),CF_IO_R,CFJO_W,CF_REG,CFJlESET, 
CFJNTRQ, 

//ADC 

ADCJ3B,  ADC_CS,  ADC_RD,  ADCAVR,  ADCJNTRQ,  ADC.CLK, 
//Ethernet 

ETHEIERXD ,  ErHER.RKD ,  ETHER-RESET 

); 


/  /  clock 

input  OSC_25;  //25  MHz  clock 


//led 

output  OLED;  //output  led,  active  high 


//compact  flash 
output  CF_CS0 ; 
output  CF_CS1 ; 
output  [2:0]  CF_A; 
inout  [15:0]  CF_D; 


//compact  flash  CSO ,  active  low 
//compact  flash  CSl ,  active  low 
//compact  flash  address 
//compact  flash  databus 
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output  CF_IO_R; 
output  CF_IO_W ; 
output  CF_REG; 
output  CFJRESET ; 
input  CFJNTRQ; 


//compact  flash  read,  active  low 
//compact  flash  write,  active  low 
//not  used,  always  tie  to  VCC 
// reset 

//interrupt  request 


//ADC 


inout  [7:0]  ADC_DB; 
output  ADC-CS; 
output  ADC_RD; 
output  ADC-WR; 
input  ADC JNTRQ ; 
output  ADC_CLK; 


//ADC  databus 

//ADC  chip  select  ,  active  low 
//ADC  read,  active  low 
//ADC  write  ,  active  low 
//ADC  interrupt  request  ,  active  low 
//ADC  clock 


/  /Ethernet 
input  ETHERJRXD; 
output  ETHER.TXD; 
output  ETHERJRESET; 


//serial  input  line  from  ethernet  module 
//serial  output  line  to  ethernet  module 
//ethernet  reset  ,  active  low 


//Begin  global  wires  and  regs 
wire  PLED ; 

wire  adcEnable ;  //enables  adc  ,  active  high 
wire  cfEnable ;  //enable  cf  controller  ,  active  high 
wire  sysClk  ;  //system  clock 
wire  [7:0]  microData; 
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wire  microWrite; 
wire  debug  ; 

//End  global  wires  and  regs 

//Begin  module  instantiations 

//resetGen  module  and  connections 
wire  globalReset  ; 

resetGen  aResetGen  (sysClk  ,  globalReset)  ; 

//adc  buffer 

wire  [15:0]  adcBufferDataIn  ,  adcBufferDataOut  ; 
wire  adcBufferWr  ,  adcBufferRd  ,  adcBufferInClk  ,  adcBufferOutClk  , 
adcBufferEmpty  ,  adcBufferFull ; 

adcBuffer  aAdcBuffer  (  adcBufferDataIn  ,  adcBufferWr  ,  adcBufferRd  , 
adcBufferInClk  ,  adcBufferOutClk  ,  adcBufferDataOut  , 
adcBufferEmpty  ,  adcBufferFull)  ; 

//adc  controller 
wire  [7:0]  ADCJDB; 

wire  ADC-CS ,  ADC  JED ,  ADC_WR,  ADC  JNTRQ ,  ADC_CLK ; 
adcController  aAdcController  ( OSC-25  ,  globalReset  ,  adcEnable  , 
ADC_DB ,  ADC_CS ,  ADC-RD ,  ADCJWR,  ADC  JNTRQ ,  ADC.CLK , 
adcBufferDataIn  ,  adcBufferWr  ,  adcBufferInClk)  ; 
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//ethernet  buffer 

wire  [7:0]  etherBufferln  ,etherBufferOut  ; 
wire  etherBufferWr  ,  etherBufferRd  ,  etherBufferFull  , 
etherBufferEmpty  ; 
wire  [9:0]  etherUsedWords  ; 

etherFifo  aEtherFifo  (  etherBufferln  ,  etherBufferWr  ,  etherBufferRd 
sysClk  ,  etherBufferOut  ,  etherBufferFull  ,  etherBufferEmpty  , 
etherUsedWords)  ; 

//ethernet  controller 
wire  etherSendRaw ; 

ether  Controller  aEtherController  (~  sysClk  ,  globalReset  jETHERJlXD 
,ETHER.TXD,ETHERJIESET,  etherBufferRd  ,  etherBufferOut  , 
etherBufferEmpty  ,  etherUsedWords  ,  etherSendRaw  ,  debug)  ; 


//compact  flash  read  buffer 

wire  [15:0]  cfBufferDataIn  ,  cfBufferDataOut  ; 
wire  [7:0]  cfBufferAddrIn  ,  cfBufferAddrOut  ; 
wire  cfBufferWr  ; 

/  /  cfBuffer  aCfBuffer  (  cfBufferDataIn  ,  cfBufferWr  ,  cfBufferAddrIn  , 
cfBufferAddrOut  ,  ~  sysClk  ,  cfBufferDataOut )  ; 

//compact  flash  write  buffer 

wire  [15:0]  cfWriteBufferDataIn  ,  cfWriteBufferDataOut  ; 
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wire  [7:0]  cfWriteBuffer Addrin  ,  cfWriteBufferAddrOut  ; 
wire  cfWriteBufferWr  ; 

cf Buffer  bCfBuffer  (  cfWriteBufferDataIn  ,  cfWriteBufferWr  , 
cfWriteBufferAddrIn  ,  cfWriteBufferAddrOut  ,~sysClk  , 
cfWriteBufferDataOut )  ; 

//compact  flash  controller 

wire  CF_CS0 ,  CF.CSl ,  CF_IO_R ,  CFJO_W ,  CFJIEG ,  CFJIESET ,  CF JNTRQ , 
cfBusy  ,  cfLoadWrite  ,  cfIsReadMode  ,  tempDebug  ; 
wire  [2:0]  CF_A; 
wire  [15:0]  CFJ3 ; 
wire  [3:2]  cfDebug ; 
wire  [7:0]  cfLoadDataIn  ; 

cfController  aCfController  (~ sysClk  ,  globalReset  ,  cfEnable  , 
microWrite  ,  microData  ,  CF_CS0  ,  CF_CS1 ,  CF_A ,  CF_D ,  CF_IO-R , 
CF_IO_W,  CFJIEG,  CFJIESET,  CF  JNTRQ,  cfDebug  ,  cfBufferDataIn  , 
cfBufferAddrIn  ,  cfBufferWr  ,  cfBusy  ,  cfIsReadMode  , 
cfWriteBufferAddrOut  ,  cfWriteBufferDataOut  , tempDebug); 

//prep 

wire  [7:0]  prepV  ,  prepi  ; 

wire  prepNewSample  ,  prepNewEnvelope  ,  prepDebug  ; 
wire  [127:0]  prepEnvelopes  ; 

prep  aPrep  ( ~  sysClk  ,  globalReset  ,  prepV  ,  prepi  ,  prepNewSample  , 
etherSendRaw  ,  prepEnvelopes  ,  prepNewEnvelope  ,  prepDebug)  ; 
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//main  controller 

mainController  aMainController  (  sysClk  ,  globalReset  , 

adcBufferDataOut  ,  adcBufferRd  ,  adcBufferEmpty  ,  adcBufferFull  , 
cfBuffer AddrOut  ,  cfBufferDataOut  ,  cfWriteBufferDataIn  , 
cfWriteBufferWr  ,  cfWriteBuffer  Addrin  ,  microData  ,  microWrite  , 
cfIsReadMode  ,  cfBusy  ,  cfEnable  ,  etherBufferIn  ,  etherBufferWr  , 
etherBufferFull  ,prepV,prepI  ,  prepNewSample  ,  prepEnvelopes  , 
prepNewEnvelope  ,  adcEnable)  ; 

//End  module  instantiations 

//Begin  global  assignments 
//assign  adcEnable=~globalReset  ; 
assign  OLEEfc=debug ; 
assign  sysClk=OSC_25 ; 
assign  adcBufferOutClk=~sysClk  ; 

//assign  ETHERJlESET=l’bl ; 

//End  global  assignments 
endmodule 

//this  module  is  responsible  for  managing  the  global  reset 
signal 
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//it  assures  that  reset  is  asserted  for  the  first  64k  clock 
cycles 

module  resetGen  (  elk  ,  resetOut )  ; 
input  elk;  //system  elk 

output  resetOut;  //global  reset,  active  high 

parameter  numCycles  =  64000; 
reg[15:0]  cycleCount  ; 
wire  resetOut  ; 

assign  resetOut=(cycleCount<numCycles)  ; 
always  @(posedge  elk)  begin 

cycleCount  <=(cycleCount<numCycles)  ?  cycleCount +1 ’bl  : 
cycleCount  ; 

end 

endmodule 

//this  module  is  responsible  for  overall  system  control 
//it  takes  data  from  adc  fifo  ,  moves  it  to  cf  buffer,  and  uses 
cfController  to  write  to  cf  card 
module  mainController  (  elk  ,  reset  ,  adcBufferDataOut  ,  adcBufferRd  , 
adcBufferEmpty  ,  adcBufferFull  ,  cfReadBufferAddrOut  , 
cfReadBufferDataOut  ,  cfWriteBufferDataIn  ,  cfWriteBufferWr  , 
cfWriteBuffer  Addrin  ,microData  ,microWrite  ,  cfIsReadMode  ,  cfBusy  , 
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cfEnable  ,  etherBufferDataIn  ,  etherBufferWr  ,  etherBufferFull  , 
prepV  ,  prepi  ,  prepNewSample  ,  prepEnvelopes  ,  prepNewEnvelope  , 
adcEnable )  ; 

input  elk;  //system  elk,  25MHz 

input  reset;  //global  reset; 

input  [15:0]  adcBufferDataOut ;  //output  data  from  adc  buffer 
input  adcBufferEmpty ;  //high  when  adc  buffer  is  empty 

input  adcBufferFull  ;  //high  when  adc  buffer  is  full 

input  [15:0]  cfReadBufferDataOut ;  //output  data  from  cf  read 
buffer 

input  cfBusy ;  //high  when  cf  card  is  busy 

input  etherBufferFull;  //high  when  ethernet  buffer  is 

full 

input  [127:0]  prepEnvelopes;  //prep  envelopes 

input  prepNewEnvelope;  //high  for  one  cycle  when  a  new 

set  of  prep  envelopes  is  available 

output  adcBufferRd ;  //read  line  for  adc  buffer  ,  active 

high 

output  [7:0]  cfReadBufferAddrOut  ;  //address  for  read  buffer  of 
cf  card 

output  [15:0]  cfWriteBufferDataln ;  //input  date  to  cf  write 
buffer 

output  cfWriteBufferWr ;  //write  line  for  cf  write  buffer 

,  active  high 
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//address  for  write  buffer 


output  [7:0]  cfWriteBuffer Addrin  ; 
of  cf  card 

output  [7:0]  microData;  //databus  to  other  modules 

output  microWrite;  //write  line  to  other  modules, 

active  low 

output  cfIsReadMode ;  //I  to  read  from  cf  Card,  0  to 

write  to  cf  card 

output  cfEnable;  //enable  signal  to  cf  controller, 

active  high 

output  [7:0]  etherBufferDataIn  ;  //input  data  to  ethernet 
buffer 

output  etherBufferWr ;  //write  line  for  ethernet  buffer, 

active  high 

output  [7:0]  prepV ;  //next  voltage  sample  for  prep 

output  [7:0]  prepi  ;  //next  current  sample  for  prep 

output  prepNewSample ;  //asserted  for  one  cycle  to  inform 

prep  module  that  a  new  sample  is  ready  ,  active  high 
output  adcEnable ;  //asserted  to  enable  adc 

//output  led  ; 

//reg  led; 

reg  adcBufferRd  ,  cfWriteBufferWr  ,  microWrite  ,  cfIsReadMode  , 
cfEnable  ,  etherBufferWr  ,  prepNewSample  ,  adcEnable  ; 
reg  [7:0]  cfWriteBufferAddrIn  ,  microData  ,  cfReadBufferAddrOut  , 
etherBufferDataIn  ,  prepV  ,  prepi ; 
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reg[15:0]  cfWriteBufferDataIn  ; 

reg[29:0]  initCount  ;  //used  for  initialization 

reg[2:0]  adcState;  //state  for  adc  buffer  read  fsm 

reg  pendingCfWrite ;  //high  when  there  is  a  pending 

write  to  cf  card 

reg  cfWriteDone ;  //high  when  cf  write  completes 

reg  oldPrepNewEnvelope ;  //prepNewEnvelope  delayed  by  one 

cycle 

reg[3:0]  cfState  ;  //state  for  cf  write  fsm 

reg  [27:0]  cfLBA ;  //sector  address  of  cfCard 

reg [15:0]  envO  ,  envl  ,  env2 , env3 ,  //prep  envelopes 
env4 , env5 , env6 , env7 ; 

reg  [2:0]  etherState  ;  //state  for  ether  write  fsm 

reg  pendingEnv ;  //I  when  there  are  envelopes  waiting 

to  be  written  to  wifi  controller 
reg  [2:0]  envCount ;  //used  to  count  envelopes 

reg  initDone ;  //used  to  control  initialization 

//fsm  state  enum,  adc 
parameter  adcWaiting=3’b000 ; 
parameter  adcRead0=3’b001  ; 
parameter  adcReadl=3’b010  ; 
parameter  adcWrite0=3’b011 ; 
parameter  adcWritel=3’bl00  ; 
parameter  adcWrite2=3’bl01 ; 
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parameter  adcWrite3=3’bllO  ; 
parameter  adcUpdate=3’blll  ; 

//fsm  state  enum ,  cfWrite 
parameter  cfWriteWaiting=4’b0000  ; 
parameter  cfWriteB0a=4’b0001  ; 
parameter  cfWriteB0b=4’b0010  ; 
parameter  cfWriteBla=4’b0011  ; 
parameter  cfWriteBlb=4’b0100  ; 
parameter  cfWriteB2a=4’b0101 ; 
parameter  cfWriteB2b=4’b0110  ; 
parameter  cfWriteB3a=4’b0111  ; 
parameter  cfWriteB3b=4’bl000  ; 
parameter  cfWriteB4a=4’bl001  ; 
parameter  cfWriteB4b=4’bl010  ; 
parameter  cfWriteComDone=4’bl011 ; 
parameter  cfWriteWaitBusy=4’bll00  ; 
parameter  cfWriteEnd0=4’bll01  ; 
parameter  cfWriteEndl  =4’blll0  ; 

//fsm  state  enum,  ether  write 
parameter  etherWait  =3’b000  ; 
parameter  etherGet  =3’b001 ; 
parameter  etherWrite0=3’b010  ; 
parameter  etherWritel  — 3’bOll ; 
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parameter  etherWrite2=3’bl00  ; 
parameter  etherWriteS  =3’bl01  ; 
parameter  etherDoiie  =  3’bllO  ; 

parameter  initPer  =450000000;  //the  device  doesn’t  begin 
collecting  data  until  initPer  cycles  after  reset 
parameter  maxEnv=3’h7; 

//transfer  data  from  adc  buffer  to  cf  buffer  and  ethernet 
buffer 

always  @(posedge  elk)  begin 

initCount  <=(initCount<initPer )  ?  initCount+l’bl  :  initCount 
adcEnable<=(initCount=initPer )  ; 
if  ( (  reset==rbl)  ||  ( initCount<initPer  ) )  begin 

// initCount  <=20’b0 ; 
adcState<=adcWaiting  ; 

// pendingCfWrite  <=rb0 ; 
adcBufferRd  <=l’b0 ; 

/ /  cfWriteBufferWr  <=rb0  ; 

/ /  cfWriteBufferAddrIn  <=8’b0  ; 
prepNewSample<=rbO ; 
end 

else  begin 

/*  if  (initCount<initPer )  begin  //do  initializations  ,  setup 
cfWriteBuffer  for  test 
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cfWriteBufferAddrIn <=(initCount  [2:0]  =  =  3’bl00)  ? 

cfWriteBufferAddrIn  +  l’bl  :  cfWriteBufferAddrIn; 
cfWriteBufferDataIn  <={8’hf0  ,initCount  [10:3]}; 
cfWriteBufferWr  <=(initCount  [2:0]  =  =  3  ’  bill )  ; 
end 

else  begin*/ 

prepNewSample<=(adcState=adcWriteO )  ; 
adcBufferRd <=(adcState=adcReadO )  ; 

//cfWriteBufferWr  <=(adc  St  at  e=adc  Writ  eO )  ; 
case ( adcState ) 

adcWaiting:  adcState <=(adcBufferEmpty==l’bO  && 

pendingCfWrite==l’bO)  ?  adcReadO  :  adcWaiting; 
adcReadO  :  adcState<=adcReadl ; 

adcReadl  :  begin 

(prepl  ,  prepV}<=adcBufferDataOut ; 
adcState<=adcWriteO  ; 
end 

adcWriteO  :  adcState<:=adcWritel  ; 

adcWritel  :  adcState<=adcWrite2  ; 

adcWrite2  :  adcState<=adcWrite3  ; 

adcWrite3  :  adcState<=adcUpdate  ; 

adcUpdate:  adcState<=adcWaiting ; 

endcase 
//end 
end 
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end 


//transfer  data  from  cf  buffer  to  cf  card 
//byteO:  low  8  bits  of  sector  address 
//bytel :  next  higher  8  bits  of  sector  address 
//byte2;  next  higher  8  bits  of  sector  address 
//byteS:  4  zeros,  high  4  bits  of  sector  address 
//byte4:  the  literal  Olh 
always  @(posedge  elk)  begin 

cfIsReadMode<=rbO //temporary,  for  now  always  write 
//led<=(cfState=cfWriteWaitBusy )  ; 
i f ( reset )  begin 

cfWriteDone  <=l’bl ; 
cfLBA<=28’h0000101  ; 
cfEnable  <=l’bO ; 
microWrite <=l’bl  ; 
cfState<=cfWriteWaiting  ; 
end 

else  begin 

cfEnable  <=~(cfState=cfWriteEndl )  ; 
case ( cfSt  ate  ) 

cfWriteWaiting :  begin  //wait  for  pending  write  to  be 
asserted  and  cfBusy  to  clear 

cfState  <=(pendingCfWrite==l’bl  kk.  cfBusy==rbO 
)  ?  cfWriteBOa  :  cfWriteWaiting; 
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microWrite <=l’bl  ; 


end 

cfWriteBOa:  begin  //write  bO 

cfWriteDone  <=l’bO ; 
cfState<=cfWriteBOb  ; 
microData<=cfLBA  [7:0]; 
microWrite  <=l’b0 ; 
end 

cfWriteBOb:  begin  //write  bO 

cfWriteDone  <=l’b0 ; 
cfState<=cfWriteBla  ; 
microData<=cfLBA  [7:0]; 
microWrite  <=rbl  ; 
end 

cfWriteBla:  begin  //write  bl 

cfState<=cfWriteBlb  ; 
cfWriteDone  <=l’b0 ; 
microData<=cfLBA  [15:8] 
microWrite  <=l’b0  ; 
end 

cfWriteBlb:  begin  //write  bl 

cfWriteDone  <=l’b0 ; 
cfState<=cfWriteB2a ; 
microData<=cfLBA  [15:8] 
microWrite  <=l’bl ; 
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end 


cfWriteB2a:  begin  //write  b2 

cfState  <=cfWriteB2b  ; 
cfWriteDone  <=l’bO ; 
microData<=cfLBA  [23:16]; 
micro  Write  <=rbO ; 
end 

cfWriteB2b:  begin  //write  b2 

cfWriteDone  <=rbO ; 
cfState  <=cfWriteB3a ; 
microData<=cfLBA  [23:16]; 
microWrite  <=l’bl ; 
end 

cfWriteB3a:  begin  //write  b3 

cfState<=cfWriteB3b  ; 
cfWriteDone  <=l’bO ; 
microData<— {4’hO  ,  cfLBA  [27:24]}; 
microWrite  <=l’bO ; 
end 

cfWriteB3b:  begin  //write  b3 

cfWriteDone  <=l’bO ; 
cfState<=cfWriteB4a ; 
microData<={4’hO  , cfLBA  [27 :24]  }  ; 
microWrite  <=rbl ; 
end 


136 


cfWriteB4a:  begin  //write  b4 

cfState<=cfWriteB4b  ; 
cfWriteDone  <=rbO ; 
microData  <=8’h01 ; 
microWrite  <=l’bO ; 
end 

cfWriteB4b:  ■  begin  //write  b4 
cfWriteDone  <=l’bO ; 
cfState<=cfWriteComDone  ; 
microData  <=8’h01  ; 
microWrite  <=l’bl ; 
end 

cfWriteComDone :  begin  //do  nothing  for  one  cycle 
cfState<=cfWriteWaitBusy  ; 
cfWriteDone  <=rbO ; 
microWrite  <=l’bl ; 
end 

cfWriteWaitBusy :  begin  //wait  for  busy  to  clear 

cfState  <=(cfBusy==l’bO)  ?  cfWriteEndO  : 

cfWriteWaitBusy ; 
cfWriteDone  <=l’bO ; 
microWrite  <  =  l’bl ; 
end 

cfWriteEndO:  begin  //increment  Iba  ,  clear  cf  enable 

cfState<=cfWriteEndl  ; 
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cfLBA<=cfLBA+rbl ; 
microWrite  <=l’bl  ; 
cfWriteDone  <=rbl ; 
end 

cfWriteEndl  :  begin  //do  nothing  for  one 
cfState<=cfWriteWaiting  ; 
microWrite  <=l’bl ; 
cfWriteDone  <=l’bl ; 
end 

default  :  begin 

cfState<=cfWriteWaiting  ; 
end 

endcase 

end 

end 

//send  prep  envelopes  to  wifi  controller  and  c 
always  @(posedge  elk)  begin 

oldPrepNewEnvelope<=prepNewEnvelope  ; 
if ( reset )  begin 
pendingEnv  <=l’bO ; 
pendingCfWrite  <=l’bO  ; 
etherState<=etherWait ; 
et  her  Buffer  Wr  <=l’bO  ; 
cfWriteBuffer  Addrin  <=8’b0 ; 


cycle 


controller 
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cfWriteBufferWr  <=l’bO ; 
envCount  <=3’b0  ; 
initDone  <=rbO  ; 
end 

else  begin 

if  ( initCount<initPer  &&  initDone==rbO )  begin  //do 
initializations  ,  setup  cfWriteBuffer  for  test 
cfWriteBuffer  Addrin  <=(init  Count  [2:0]  =  =  3’blll)  ? 

cfWriteBuffer AddrIn  +  1 ’bl  ;  cfWriteBufferAddrIn  ; 
cfWriteBufferDataIn  <={8’lif0  ,  init Count  [10:3]}; 
cfWriteBufferWr <=(init Count  [2:0]==  =  3’bl00)  ; 
initDone<=(initCount  [2:0]  =  =  3 ’bill  &fe  cfWriteBufferAddrIn 
==8’tiff)  ; 

end 

else  begin 

etherBufferWr  <=(etherState=etherWritel  ||  etherState= 
etherWriteS )  ; 

cfWriteBufferWr <=(etherState=etherWritel )  ; 
i  f  ( prepNewEnvelope==l’bl  &&  oldPrepNewEnvelope==l’bO  Mz 
pendingEnv==l’bO)  begin  //if  a  new  set  of  envelopes 
just  arrived  ,  start  send  process 
pendingEnv<=l’bl  ; 
etherState<=etlierWait ; 
envCount  <=3’b0 ; 
end 
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else  begin 

case ( etherState ) 
etherWait  :  begin 

etherState  <=(pendingEnv==rbl 

pendingCfWrite==rbO)  ?  etherGet  : 
etherWait  ; 

pendingCfWrite<=cfWriteDone  ?  I’bO  : 
pendingCfWrite  ; 

end 

etherGet  :  begin 

{envO  ,  envl  ,  env2  ,  env3  ,  env4  ,  env5  ,  env6  ,  env7}< 
prepEnvelopes  ; 
etherState<=etherWriteO  ; 
end 

etherWriteO  :  be  gin 

etherState<=etherWritel  ; 
case  ( envCount ) 

3’bOOO:  begin 

etherBufferDataIn<=envO  [15:8]; 
cfWriteBufferDataIn<=envO  ; 
end 

3 ’bOOl :  begin 

etherBufferDataIn<=envl  [15:8]; 
cfWriteBufferDataIn<=envl  ; 
end 
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3’bOlO:  begin 


etherBufferDataIn<=env2  [15:8]; 
cfWriteBufferDataIn<=env2  ; 
end 

3 ’ bOll :  begin 

etherBufferDataIn<=env3  [15:8]; 
cfWriteBufferDataIn<=env3  ; 
end 

3’blOO:  begin 

etherBufferDataIn<=env4  [15:8]; 
cfWriteBufferDataIn<=env4  ; 
end 

3 ’ blOl :  begin 

ether  Buff  erD  at  aln<=env5  [15:8]; 
cfWriteBufferDataIn<=env5  ; 
end 

3’bllO:  begin 

etherBufferDataIn<=env6  [15:8]; 
cfWriteBufferDataIn<=env6  ; 
end 

3 ’ bill :  begin 

etherBufferDataIn<=env7  [15:8]; 
cfWriteBufferDataIn<=env7 ; 
end 

endcase 
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end 


etherWritel  :  begin 

etherState<=etherWrite2  ; 
end 

etherWrite2  :  begin 

etherState<=etherWrite3  ; 

cfWriteBuffer  AddrIn<=cfWriteBuffer  Addrin  +  1  ’ 
bl ; 

case (envCount) 

3  ’bOOO  :  etherBufferDataIn<=envO  [7:0]; 

3  ’bOOl  :  etherBufferDataIn<=envl  [7:0]; 

3  ’ bOlO  :  etherBufferDat aln<=env2  [7:0]  ; 

3  ’  bOll  :  etherBufferDataIn<=env3  [7:0]; 

3  ’blOO  :  etherBufferDataIn<=env4  [7:0]; 

3  ’  blOl  :  etherBufferDat aln<=env5  [7:0]; 

3  ’bllO  :  ether  Buff  erD  at  aln<=env6  [7:0]; 

3  ’bill  :  .  ether  Buffer  D  at  aln<=env7  [7:0]; 
endcase 
end 

etherWrite3  :  begin 

envCount<=envCount  +  l’bl ; 

etherState  <=(envCount=maxEnv)  ?  etherDone 
etherWriteO  ; 

end 

etherDone :  begin 
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etherState<=etherWait  ; 

pendingCfWrite<=(cfWriteBtiffer  Addrln==8’h00) 

pendingEnv<=l’bO ; 
end 


endcase 

end 

end 

end 

end 


always  @(posedge  elk)  begin 

cfReadBufferAddrOut<=cfReadBufferAddrOut  +  rbl ; 
end 

endmodule 
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module  prep  (  elk  ,  reset  ,  v  ,  i  ,  newSample  ,  sendRaw  ,  envelopes 


newEnvelope  ,  led  )  ; 
input  elk  ; 
input  reset  ; 
input  [7:0]  v ; 
input  [7:0]  i  ; 
input  newSample; 


//system  elk,  25MHz 
//global  reset  ,  active  high 
//voltage  sample 
//curent  sample 

//high  for  one  cycle  when  a  new  sample 

of  V  and  i  arrives 

input  sendRaw;  //used  to  force  sending  raw  data  instead 

of  prep  data 


output  [  1  2  7 :  0]  envelopes;  //spectral  envelopes  produced  by 
prep 

output  newEnvelope;  //high  for  one  cycle  when  a  new 

envelope  is  produced 
output  led  ; 
reg  led  ; 


wire[127:0]  envelopes; 
reg  newEnvelope  ; 
reg  [6:0]  basisAddr  ; 
reg  squareV ; 

phase  as  v 
reg  oldSquareV  ; 
reg  delSquareV ; 


//used  to  index  basis  sinusoids 
//square  wave  with  same  frequency  and 

//squareV  delayed  by  one  sample 
//squareV  delayed  by  one  clock  cycle 
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reg[4:0]  debounceCount  ;  //used  to  debounce  v 

reg  oldNewSample ;  //newSample  delayed  by  one  clock 

cycle 

reg  [2:0]  envState;  //state  for  envelope  fsm 

reg  signed  [22:0]  envAccO  , envAccl  ,  //accumulators  for  envelopes 
envAcc2  ,  envAccS  ,  envAcc4  , 
envAcc5  ,  envAccO  ,  envAccT  ; 

reg  [2:0]  envNum;  //current  envelope  being  worked  on 

reg  [6:0]  rawCount ;  //used  to  count  packets  for  raw 

transmission 

reg  squareVSync;  //sync  signal  produced  by  cleaned  v 

reg  [23:0]  numCycles ;  //period  of  squareV  in  cycles 

reg  [23:0]  cycleCount  ;  //used  to  produce  numCycles 

reg  [23:0]  addrCycleCount ;  //used  to  count  time  between 

address  increments 

reg  envFired  ;  //used  to  signify  envelope 

transmission 

reg  [127:0]  iRaw ;  //stores  raw  i 


//envelope  creation  fsm  state  enum 
parameter  wait0=3’b000  ; 
parameter  waitl=3’b001  ; 
parameter  getV=3’b010; 
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parameter  waitMult0=3’b011  ; 
parameter  waitMultl=3’bl00  ; 
parameter  getProd=3’bl01  ; 
parameter  done=3’bll0  ; 

parameter  numEnvelopes=4’h8 ;  //number  of  envelopes  to  compute 

//basisRom  module  and  connections 
wire  [63:0]  basisOut  ; 

basisRom  aBasisRom  (  basisAddr  ,  ~  elk  ,  basisOut )  ;  //~clk  used  to 
meet  setup  and  hold  times  of  RDM 

//prepMult  module  and  connections 

reg  signed  [7:0]  multInO  ,  multinl  ;  //multInO  is  i,  multinl  is  a 
basis  sinusoid 
wire  signed  [15:0]  multOut ; 

prepMult  aPrepMult  (~  elk  ,  multInO  ,  multinl  ,  mult  Out )  ; 

assign  envelopes  =  (sendRaw==l’bl )  ?  iRaw  :  {envAccO  [  1  9  :  4]  , 
envAccl  [  1  9  :  4]  ,  envAcc2  [  1  9  : 4]  ,  envAcc3  [  1  9  :  4]  ,  envAcc4  [  1  9  :  4]  , 
envAcc5  [  1  9  :  4]  ,  envAccG  [  1  9  : 4]  ,  envAcc7  [  1  9  :  4]  }  ; 

//generate  addr  to  basis  sinusoid  rom 
always  @(posedge  elk)  begin 
delSquareV<=squareV  ; 
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if(reset)  begin 

oldNewSample  <=l’bO ; 
squareV  <=l’bO ; 
debonnceCount  <=5’b0 ; 
basisAddr  <=7’b0 ; 
numCycles  <=24’h065B9A ; 
cycleCount  <=24’b0 ; 
end 

else  begin 

oldNewSample<=newSaniple ; 

/ /  produce  squareV 

if  (newSample==l’bl  kk  oldNewSample==l’bO)  begin  //if  a  new 
sample  was  received  ,  process  it 
oldSquareV<=squareV  ; 

//produce  squareV  by  passing  v  through  a  hysteretic 
comparator,  after  detecting  an  edge,  won’t 
//encounter  a  new  edge  for  at  least  32  samples 
if  (squareV==l’bO  kk  v>8’h7f  kk  debounceCount==5’bO) 
begin  //rising  edge 
squareV <=l’bl ; 
debounceCount  <=5’b00001  ; 
end 

else  if  (squareV==l’bl  kk  v<8’h80  kk  debounceCount==5’bO) 
begin  //falling  edge 
squareV  <=rbO ; 
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debounceCount  <=5’b00001  ; 
end 

else  begin  //no  edge 

debounceCount <=(debounceCount==5’bO)  ?  5’bO  : 
debounceCount  +  1 ’bl ; 

end 

end 

//produce  basisAddr  using  squareV 

i f  ( squareV==l’bl  &&  delSquareV==l’bO)  begin  //if  there  is 
a  rising  edge  in  squareV  ,  reset  basisAddr  ,  assures  sync 
basisAddr  <=7’b0  ; 
end 

else  begin 

// basisAddr <=(newSample==l’bl  &&  oldNewSample==rbO)  ? 
basisAddr  +  l’bl  :  basisAddr; 

basisAddr  <=(addr  Cycle  Count ==((numCycles>>4’h7 )  —  I’bl ) )  ? 
basisAddr  +  l’bl  :  basisAddr; 

end 

//produce  sync  pulse 
squareVSync<=(squareV!  =  delSquareV )  ; 

//produce  cycle  counts 
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cycleCount<=(squareV==l’bl  &&  delSquareV==l’bO)  ?  24 ’bO  : 
cycleCount +1 ’bl  ; 

numCycles<=(squareV==l’bl  &&  delSquareV==l’bO)  ?  (( 
cycleCount+numCycles)  >>rbl )  :  numCycles  ; 

addr  Cycle  Count  <=((addrCycleCount==((numCycles>>4’h7 )  —  1  ’bl ) 
)  II  (squareV==l’bl  &&  delSquareV==l’bO ) )  ?  24 ’bO  : 
addrCycleCount  +  1 ’bl ; 

end 

end 

//produce  envelopes 
always  @(posedge  elk)  begin 
led<=sendRaw ; 
if(reset)  begin 

newEnvelope<=l’bO ; 
envState<=done ; 
envNum<=3’bO ; 
rawCount<=7’bO  ; 
envFired  <=l’bO  ; 
envAccO <=23’sb0  ; 
envAccl  <=23’sb0  ; 
envAcc2<=23’sbO  ; 
envAcc3<=23’sbO  ; 
envAcc4<=23’sbO  ; 
envAccS <=23’sb0  ; 
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envAcc6<=23’sbO  ; 
envAcc? <=23’sb0  ; 
end 

else  begin 

if  (newSaniple==l’bl  &&  oldNewSample==l’bO)  begin  //if  a  new 
sample  was  received  ,  process  it 
multln0<=(i  >8’li7f )  ?  (i— 8’h80)  :  ( (  ~  ( 8  ’  h80— i ) ) +1 ’bl )  ; // 
convert  to  signed,  two’s  complement 
envState<=waitO  ; 
envNum<=3’bO ; 
rawCount<=rawCount  +  l ’bl ; 
newEnvelope<=l’bO ; 
end 

else  begin 

if  (sendRaw==l’bO)  begin//do  normal  prep  data 

newEnvelope<=(basisAddr==7’bO  &&  squareVSync==l’bl )  ; 
//if  we  just  cycled  to  data  for  next  envelope,  send 
current  envelope 
case ( envState ) 

waitO  :  envState<=waitl  ; 
waitl  :  envState<=getV ; 
getV :  begin 

case  (envNum) 

3  ’bOOO  :  multInl<=basisOut  [63:5  6]; 

3  ’bOOl  :  mult  Ini  <=basis  Out  [55:48]  ; 
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3  ’bOlO  :  multInl<=basisOut  [4  7:40]; 

3  ’  boil  :  multlnl<=basis0ut  [39:32]  ; 

3  ’  blOO  :  multlnl<=basis0ut  [31:24]; 

3  ’  blOl  :  mult  In  l<=b  as  is  Out  [23:16]; 

3’bllO:  multInl<=basisOut  [  1  5  :  8]  ; 

3 ’bill:  multInl<=basisOut  [7  :  0]  ; 
endcase 

envState<=waitMultO  ; 
end 

waitMultO  :  envState<=waitMultl  ; 
waitMultl  :  envState<=getProd  ; 
getProd:  begin 

case  (envNum) 

3’bOOO:  envAccO<=(basisAddr==:7’bO  && 

oldSquareV!  =  squareV)  ?  multOut  :  envAccO 
+multOut ; 

3’bOOl:  envAccl<=(basisAddr==7’bO  && 

oldSquareV!  =  squareV)  ?  multOut  :  envAccl 
+multOut  ; 

3’bOlO:  envAcc2<=(basisAddr==7’bO  && 

oldSquareV!  =  squareV)  ?  multOut  :  envAcc2 
+multOut ; 

3’bOll:  envAcc3<=(basisAddr==7’bO  &&: 

oldSquareV!=squareV)?  multOut  :  envAcc3 
+multOut ; 
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3’blOO:  envAcc4<=(basisAddr  ==7’b0  &fe 

oldSqxiareV!  =  squareV )  ?  multOut  :  envAccd 
+multOut ; 

3’blOl:  envAcc5<=(basisAddr==7’bO  &&: 

oldSquareV  !  =  squareV )  ?  multOut  :  envAccS 
+multOut ; 

3’bllO:  envAcc6<=(basisAddr==7’bO  && 

oldSquareV!  =  squareV)  ?  multOut  :  envAccG 
AmultOut ; 

3 ’bill:  envAcc7<=(basisAddr==7’bO 

oldSquareV!=squareV)  ?  multOut  :  envAcc7 
+multOut ; 

endcase 

en  vN  um<=en  vN  um  + 1  ’  b  1 ; 

envState <=(envNum==(numEnvelopes  —  1) )  ?  done 
:  waitl  ; 

end 

done:  envState<=envState  ; 
default  :  envState<=envState  ; 
endcase 
end 

else  begin  //send  raw  data 

newEnvelope<=(rawCount==7’h7f  &fe  envFired===l’bO)  ; 
envFired <=(rawCount==7’liO)  ?  I’bO  :  (rawCount==7’h7f  ? 

1  ’bl  :  envFired )  ; 
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case  ( rawCount  [6:3]) 

4’hO:  iRaw[127:120]  <=i  ; 
4’hl:  iRaw[119:112]  <=i  ; 
4’h2:  iRaw[lll:104]  <=i  ; 
4’h3:  iRaw[103:96]  <=  i  ; 
4’h4:  iRaw [95:88]  <=  i  ; 
4’h5:  iRaw[87:80]  <=  i  ; 
4’h6:  iRaw[79:72]<=i  ; 
4’h7:  iRaw[71:64]  <=i  ; 
4’h8:  iRaw[63:56]  <=  i  ; 
4’h9:  iRaw[55:48]  <— i  ; 
4’ha:  iRaw[47:40]  <=  i  ; 
4’hb:  iRaw[39:32]  <=  i  ; 
4’hc:  iRaw[31:24]  <=i  ; 
4’hd:  iRaw[23:16]  <=  i  ; 

4 ’he:  iRaw  [15:8]  <=  i  ; 

4  ’  hf  :  iRaw[7:0]  <=  i  ; 
endcase 
end 
end 
end 
end 

endmodule 
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//this  module  controls  the  ADC 

//when  enabled  ,  it  samples  both  input  channels  at  7.68KHz, 
sequential  sampling 
//and  stores  the  result  in  the  buffer 

module  adcController  (elk  ,  reset  ,  enable  ,  adc_db  ,  adc_cs  ,  adc_rd  , 
adc_wr  ,  adc-intrq  ,  adc_clk  ,  bufferData  ,  bufferWrite  ,  bufferClk)  ; 
input  elk;  //system  elk,  25MHz 

input  reset;  //global  reset  ,  active  high 

input  enable;  //asserted  to  enable  this  module 

input  adc-intrq  ;  //interrupt  request  from  adc  ,  active  low 

inout[7:0]  adc_db ;  //databus  to  adc 

output  adc-cs  ;  //chip  select  for  adc,  active  low 

output  adc_rd  ;  //read  for  adc,  active  low 

output  adc_wr ;  //write  for  adc,  active  low 

output  adc_clk;  //clock  for  adc,  has  1/16  freq  of  elk 

output  bufferClk;  //clock  for  fifo  buffer,  inverted 

adc-clk 

output  [15:0]  bufferData;  //data  to  be  written  in  fifo 
buffer 

output  bufferWrite;  //write  signal  for  fifo  buffer,  active 

high 

wire  [7:0]  adc_db  ; 
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wire  adc_cs  ,  adc_clk  ,  bufferClk  ; 
reg  bufferWrite  ,  adc_rd  ,  adc_wr  ; 
reg[15:0]  bufferData; 

reg  [7:0]  dataReg;  //data  to  be  written  to  adc_db 

reg  adcDatabusWrite ;  //asserted  to  drive  the  adc  databus 
with  dataReg 

reg [3:0]  outClkCount ;  //used  to  produce  output  clock  to  adc 
reg[2:0]  state;  //state  for  fsin 

reg  chan;  //stores  next  channel  to  be  read  from 

reg  [11:0]  sampleClkCount ;  //used  to  generate  sample  elk 

//config  params ,  used  to  select  between  adc  channels  and  setup 
sampling  mode 

parameter  chan0Config=8’bl0100100 ;  //left  justified  ,  10— bit  , 
unsigned  ,  single  ended  ,  chan  0 

parameter  chanlConfig  =  8’bl0100101  ;  //left  justified  ,  10— bit  , 
unsigned  ,  single  ended  ,  chan  1 

//fsm  state  enum 
parameter  setupConfig=3’b000  ; 
parameter  writeConfig  =3’b001  ; 
parameter  waitForInt  =3’b010  ; 
parameter  firstRead  =3’b011 ; 
parameter  pauseRead=3’bl00  ; 
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parameter  secondRead=3’bl01  ; 
parameter  nop=3’bll0; 

//number  of  elk  cycles  between  samples 
parameter  numCycles  =  1628; 

assign  adc_db=adcDatabusWrite  ?  dataReg  :  8’hZZ; 
assign  adc_clk=outClkCount  [ 3 ]  ; 
assign  adc_cs=~enable  ; 

assign  bufferClk=~clk  ;  //inverted  to  meet  setup  and  hold  times 
of  buffer 


//interact  with  adc 
always  @(posedge  elk)  begin 

outClkCount<=outClkCount  +  l ’bl ; 
bufferWrite <=(enable==l’bl  &&  st ate=secondRead  chan==l’ 
bl  &&  outClkCount==4’b0011 )  ; 

adcDatabusWrite<=(enable==l’bl  ( state=setupConfig  || 
state=writeConfig ) )  ; 

adc_wr  <=~(enable==l’bl  state=writeConfig  )  ; 

adc_rd  <=~(enable==l’bl  &&  ( state=firstRead  ||  state= 
secondRead) )  ; 
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if(reset)  begin 
chan<=l’bO  ; 
state<=nop ; 
dataReg<=8’b01011011 ; 
sampleClkCount  <=12’b0 ; 
end 

else  if  (enable)  begin 

sampleClkCount <=(sampleClkCount=numCycles)  ?  12 ’bO  : 

sampleClkCount  +  1 ’bl ; 
case (state ) 

setupConfig:  begin  //load  config  word  to  databus 

dataReg<=clian  ?  chanlConfig  :  ctianOConfig ; 
state  <=(outClkCount==4’b0100  &&  sampleClkCount 
<{8’b0 ,5  ’  bum  })  ?  writeConfig  : 
setupConfig  ; 

end 

writeConfig:  begin  //write  config  word  to  adc 

state  <=(outClkCount==4’b0100)  ?  waitForInt  : 
writeConfig  ; 

end 

waitForInt:  begin  //wait  until  int  strobes  low, 

indicates  conversion  is  finished 

state  <=(outClkCount==4’b0100)  ?  (adc_intrq  ? 
waitForInt  :  firstRead)  :  waitForInt; 

end 
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firstRead  : 


begin  //read  high  byte 


bufferData<=chan  ?  {adc_db  ,  bufferData  [7 :  0]  } 

{  buffer  Data  [  1  5  :  8]  ,adc_db}; 
state  <=(outClkCount==4’b0100)  ?  pauseRead  : 
firstRead  ; 

end 

pauseRead;  begin  //pause  between  reads 

state  <=(outClkCount==4’b0100 )  ?  secondRead  ; 
pauseRead  ; 

end 

secondRead:  begin  //read  low  2  bits  ,  write  high  byte 

to  buffer  when  both  channels  are  done 

state  <=(outClkCount==4’b0100 )  ?  setupConfig 
secondRead  ; 

chan<=(outClkCount==4’b0100 )  ?  ~chan  :  chan; 
end 

nop:  begin 

state  <=(outClkCount==4’b0100 )  ?  setupConfig 
nop ; 

end 

endcase 

end 

else  begin 
state<=nop ; 
dataReg<=8’b01011011 ; 
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end 


end 

endmodule 
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//this  module  controls  the  cf  card 

//when  issuing  a  load  buffer  command,  data  should  be  of  the  form 

//byteO:  low  8  bits  of  sector  address 

//bytel:  next  higher  8  bits  of  sector  address 

//byte2:  next  higher  8  bits  of  sector  address 

//byte3:  4  zeros  ,  high  4  bits  of  sector  address 

//byte4;  number  of  sectors  to  read 

// 

//when  issuing  a  write  sector  command,  data  should  be  of  the 
form 

//byteO:  low  8  bits  of  sector  address 
//bytel:  next  higher  8  bits  of  sector  address 

//byte2:  next  higher  8  bits  of  sector  address 

//byteS:  4  zeros,  high  4  bits  of  sector  address 
//byte4:  the  literal  Olh 

module  cfController  (  elk  ,  reset  ,  readEnable  ,  microWrite  ,  microData  , 
csO  ,  csl  ,  addr  ,  cfData  ,  read  ,  write  ,  cfReg  ,  cfReset  ,  intrq  ,  debug  , 
bufferData  ,  bufferAddr  ,  bufferWr  ,  busy  ,  isReadMode  , 
writeBuffer Addr  ,  writeBufferData  ,  led)  ; 
input  elk;  //system  elk 
input  reset;  //global  reset; 

input  readEnable;  //asserted  to  enable  read  data  routine 
input  microWrite;  //write  line  from  micro,  active  low 
input  [7:0]  microData;  ///micro  databus 
input  intrq;  //interrupt  request 
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input  isReadMode ;  //I  for  read  mode,  0  for  write  mode 

input  [15:0]  writeBufferData ;  //write  buffer  ram  data  bus 

inout  [15:0]  cfData;  //databus  to  compact  flash 

output  csO ;  //used  to  access  task  file  ,  active  low 

output  csl ;  //used  to  access  alternate  status  register 

and  device  control  register  ,  active  low 
output  [2:0]  addr ;  //address 
output  read;  //read 

output  write;  //write 

output  cfReg ;  //not  used,  should  be  always  a  logic  1 

output  cfReset  ;  // reset  ,  active  low 

output  [15:0]  bufferData;  //buffer  ram  data  bus 

output  [7:0]  bufferAddr  ;  //buffer  ram  address 

output  [7:0]  writeBufferAddr  ;  //write  buffer  ram  address 

output  bufferWr;  //buffer  ram  write,  active  high 

output  busy;  //asserted  when  device  is  busy,  active  high 

output [3:2]  debug; 

output  led  ; 

reg  led  ; 

reg  busy; 

reg  [3:2]  debug ; 
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reg  oldWrite;  //write  delayed  by  1  clock  cycle 

reg  oldReadEnable ;  //readEnable  delayed  by  1  clock  cycle 

wire  cfReg  ,  cfReset  ; 

wire  [15:0]  cfData  ; 

reg  [2:0]  addr  ; 

reg  [7:0]  bufferAddr  ; 

wire  [7:0]  writeBufferAddr  ; 

reg  [15:0]  cfDataReg  ,  bufferData  ; 

reg  read  ,  write  ,  csO  ,  csl  ,  bufferWr  ; 

reg  cfDatabusWrite ;  //asserted  to  write  to  cf  databus  ,  active 
high 

reg  [4:0]  cfCmd ;  //stores  current  command 

reg  [4:0]  oldCfCmd ;  //stores  command  at  previous  cycle 

reg[27:0]  Iba;  //stores  logical  block  address 

reg [7:0]  sectorCount  ;  //stores  sector  count 

reg  [9:0]  cfCount ;  //temporary 

reg  [19:0]  initCount  ;  //used  to  achieve  pause  at  powerup 
reg  [6:0]  cfState  ;  //used  to  maintain  state  information  of  FSM 
reg  tempBusy ;  //temporarily  stores  busy  bit 
reg  [2:0]  packetCount ;  //counts  data  packets 
reg  pendingRead ;  //asserted  when  there  is  a  pending  read 
request 

reg  [7:0]  wordCount ;  //used  to  count  words  during  transfer 
cycle 

reg  [7:0]  numSectorsRead ;  //counts  number  of  sectors  read 
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assign  cfReset=~reset  ; 
assign  cfReg  =  l’bl; 

assign  cfData=  ( cfDatabusWrite==l’bl )  ?  cfDataReg  :  16’hZZZZ; 
assign  writeBuffer Addr=buffer Addr  ; 

parameter  initPer  =50000;  //the  device  remains  innactive  for 
initPer  cycles  after  reset 
parameter  readSectorCmd=8’h20 ;  //read  with  retries 
parameter  writeSectorCmd=8’h30 ;  //write  with  retries 
parameter  readBnfferCmd=8’he4 ;  //read  buffer 

//enumeration  of  commands,  code  00010  currently  unused,  was 
previously  read  alt  status 
parameter  nop  =  5’b00000  ; 
parameter  initCard  =5’b00001  ; 
parameter  readStatus=5’b00011  ; 
parameter  readTest=5’b00100  ; 

parameter  read0=5’b00101 ;  //wait  for  busy  to  clear 
parameter  readl=5’b00110 ;  //write  to  card/head  register 
parameter  read2=5’b0011 1  ;  //wait  for  busy  to  clear  and  drdy  to 
assert 

parameter  read3=5’b01000 ;  //send  cylinder  high 
parameter  read4=5’b01001 ;  //send  cylinder  low 
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parameter  read5  =5’b01010 ;  //send  starting  sector 
parameter  read6=5’b01011  ;  //send  number  of  sectors 
parameter  read7=5’b01100 ;  //write  command  code  (read  sector) 
parameter  read8=5’b01101  ;  //wait  for  busy  to  clear  and  drq  to 
set 

parameter  read9=5’b01110 ;  //read  a  block  of  data 

always  @(posedge  elk)  begin 
led<=busy ; 

oldWrite<=microWrite  ; 
oldReadEnable<=readEnable  ; 

initCount  <=(initCount<initPer )  ?  initCount+l’bl  :  initCount 
if ( reset ) 

initCount  <=20’b0 ; 

if  (reset  ||  ( initCount<initPer  ) )  begin 

cfDatabusWrite  <=l’bO  ; 
csO  <=rbl ; 
csl  <=rbl ; 
read  <  =  l’bl  ; 
write  <=l’bl ; 
debug<=2’b0 ; 
lba<=28’lifffffff  ; 
cfDataReg  <  =  16’ h  ff f f  ; 
cfCount  <=10’b0 ; 
cfCmd<=i  nit  Card  ; 
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oldCfCmd<=nop ; 
addr <=3’bll0  ; 
tempBusy<=rbl ; 
bufferWr  <=rbO ; 
bufferData  <=16’b0 ; 
buffer  Addr  <=8’b0 ; 
packetCount  <=3’b0 ; 
pendingRead  <=l’bO ; 
wordCount<=8’bO ; 
busy<=rbl ; 
numSectorsRead  <=8’b0 ; 
end 

/  /  process  command 
else  begin 

//if  module  is  in  read  mode,  start  gathering  Iba  data  from 
micro 

i  f  (  readEnable )  begin 

//if  readEnable  just  stepped  high,  reset  packet  count 
i f  ( oldReadEnable==l’bO)  begin 
packetCount  <=2’b0  ; 
pendingRead <=l’bO ; 
end 

//if  write  just  stepped  low,  grab  a  data  packet 
else  i  f  ( (  oldWrite==l’bl )  &&  ( microWrite==rbO) )  begin 
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case  (packet Count) 

3’bOOO:  lba<={lba  [2  7 :  8]  ,  microData  }  ; 

3  ’ bOOl :  lba<={lba  [27:16]  ,  microData  ,  Iba  [7  :  0]  }  ; 
3’bOlO:  lba<={lba  [27  :  24]  ,  microData  ,  Iba  [  1  5  :  0]  }  ; 
3’bOll:  lba<=:{microData  [3  :  0]  ,lba[23:0]}; 
default:  sectorCount<=microData ; 
endcase 

packetCount<=(packetCount==3’bl00)  ?  3’blOO  : 
packetCount  +  l’bl ; 

pendingRead <=(packet Count ==3’bl 00)  ; 
end 
end 

oldCfCmd<=cfCmd ; 

//when  command  just  changed,  reset  state  and  tempBusy 
i  f  (  ~  ( cfCmd=oldCfCmd) )  begin 
cfState  <=7’b0 ;  . 
tempBusy <=l’bl ; 
busy  <=l’bl ; 
end 

//otherwise,  process  current  command 
else  begin 

cfState<=cfState  +  l’bl ; 
case  (cfCmd) 
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initCard  :  begin  //start  initializing  compact  flash  card 

by 

//reading  status  register  (not  alt  status) 

//nb:  this  is  functionally  the  same  as 
readStatus 

//however  upon  completion  of  this  command, 
readTest  is 
// called  ,  not  nop 
csO  <=rbO ; 
csl  <=rbl ; 
addr<=3’blll  ; 

read<=(cfState  [6]==  cfState  [5] )  ;  //asserted  for 
01,10  as  high  bits 
write  <=l’bl ; 
cfDatabusWrite  <=l’b0 ; 
cfDataReg  <=16’b0 ; 

cfCmd<=(cfState==7’blllllll  &&  debug[2]  =  =  l ’bl) 
?  readTest  :  cfCmd ;  //enters  readTest  at 
end 

debug  [2]  <  =  (  cfState==7’bl011111 )  ?  (~cfData[7] 
&&  cfData[6]  &&  cfData[4])  :  debug  [2];  // 

checks  that  ready  and  dsc  are  set  ,  and  busy 
is  cleared 
busy  <=rbl ; 
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readStatus :  begin  //reads  status  register  (not  alt 
status ) 

csO  <=rbO ; 
csl  <=rbl ; 
addr  <=3’blll  ; 

read<=(cfState  [6]  =  =  cfState  [5] )  ;  //asserted  for 
01 ,10  as  high  bits 
write  <=rbl ; 
cfD at abus Write  <=l’b0  ; 
cfDataReg  <=16’b0 ; 

cfCnid<=(cfState==7’blllllll )  ?  nop  :  cfCmd ;  // 
enters  nop  at  end 
busy<=l’bl  ; 
end 

readTest  :  begin  //performs  a  test  read 
lba<=28’h0000100  ; 
cfCmd<=readO  ; 
sectorCount  <=8’h01  ; 
busy<=l’bl  ; 
end 

readO  :  begin  //wait  for  busy  to  clear  ,  functionally 

the  same  as  readStatus 
csO  <=rb0 ; 
csl  <=rbl ; 
addr <=3’blll  ; 
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read<=(cfState  [6]  =  =  cfState  [5] )  ;  //asserted  for 
01,10  as  high  bits 
write  <=l’bl ; 
cfDatabusWrite  <=rb0 ; 
cfDataReg  <=16’b0 ; 

cfCmd<=(cfState==7’blllllll  &&  tempBusy==l’bO) 
?  readl  :  cfCmd ;  //enters  readl  at  end 
tempBusy<==(cfState==7’bl011111 )  ?  cfData[7]  : 

tempBusy ; 
busy  <=l’bl ; 
end 

readl:  begin  //write  to  card/head  reg 

csO  <=rb0 ; 
csl  <=l’bl ; 
addr <=3’bll0  ; 
read  <=l’bl ; 

write  <=(cfState  [6]==cfState[5]);  //asserted 
for  01,10  as  high  bits 
cfDatabusWrite  <=l’bl  ; 
cfDataReg<={8’bO,4  ’blllO  ,lba  [27:24]}; 
cfCmd<=(cfState==7’blllllll)  ?  read2  :  cfCmd ; 

//enters  read2  at  end 
busy  <=l’bl ; 
end 
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read2  :  begin  //wait  for  busy  to  clear  ,  and  ready  to 

assert 

csO  <=l’bO  ; 
csl  <=l’bl ; 
addr <=3’blll ; 

read<=(cfState  [6]  =  =  cfState  [5] )  ;  //asserted  for 
01,10  as  high  bits 
write  <=l’bl ; 
cfDatabusWrite  <=l’b0 ; 
cfDataReg<=16’bO ; 

cfCmd<=(cfState==7’blllllll  &&:  tempBusy==l’bO ) 
?  reads  ;  cfCmd ;  //enters  readS  at  end 
tempBusy<=(cfState==7’bl011111 )  ?  (cfData[7]  | 
~cfData[6])  :  tempBusy ; 
busy  <=l’bl ; 
end 

reads :  begin  //write  cylinder  high  reg 

csO  <=rb0 ; 
csl  <=rbl ; 
addr  <=S’bl01 ; 
read  <=l’bl ; 

write  <=(cfState  [6]==cfState[5]);  //asserted 
for  01,10  as  high  bits 
cfDatabusWrite  <=l’bl ; 
cfDataReg <={8’b0  ,lba[2S:16]}; 
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cfCmd<=(cfState==7’blllllll )  ?  read4  :  cfCmd ; 

//enters  read4  at  end 
busy  <=l’bl ; 
end 

read4  :  begin  //write  cylinder  low  reg 

csO  <=l’bO ; 
csl  <=rbl ; 
addr<=3’bl00; 
read  <=l’bl ; 

write  <=(cfState  [6]==  cfState  [5])  ;  //asserted 
for  01,10  as  high  bits 
cfDatabusWrite  <=l’bl ; 
cfDataReg  <={8’b0  ,lba[15:8]}; 

cfCmd<=(cfState==7’blllllll )  ?  read5  :  cfCmd ; 

//enters  readS  at  end 
busy  <=l’bl ; 
end 

reads  :  begin  //write  sector  number  reg 

csO  <=l’b0 ; 
csl  <=l’bl ; 
addr  <=3’b011  ; 
read  <=l’bl ; 

write<=(cfState  [6]==cfState  [5])  ;  //asserted 
for  01,10  as  high  bits 
cfDatabusWrite  <=l’bl ; 
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cfDataReg<={8’bO  ,  Iba  [7:0]}; 

cfCmd<=(cfState==7’blllllll )  ?  readb  :  cfCmd 
//enters  readb  at  end 
busy  <=rbl ; 
end 

readG  :  begin  //write  sector  count  reg 

csO  <=l’b0 ; 
csl  <=l’bl ; 
addr  <=3’b010  ; 
read  <=l’bl ; 

write  <=(cfSt  ate  [6]  =  =cfState  [5])  ;  //  asserted 
for  01,10  as  high  bits 
cfDatabusWrite  <=l’bl ; 
cfDataReg<=(isReadMode==l’bl )  ?  {8’bO, 
sectorCount}  :  { 15 ’bO  ,  1  ’  bl  }  ; 
cfCmd<=(cfState==7’blllllll )  ?  read7  :  cfCmd 
//enters  read7  at  end 
busy<  =  l’bl  ; 
end 

read7 :  begin  //write  read  sector  command 

csO  <=rb0 ; 
csl  <=rbl ; 
addr <=3’blll ; 
read  <=rbl ; 
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write  <=(cf  St  ate  [6]  =  =cfState[5]);  //asserted 
for  01,10  as  high  bits 
cfDatabusWrite  <=l’bl ; 

cfDataReg<=(isReadMode==l’bl )  ?  {8’bO, 

readSectorCmd}  :  {8 ’bO  ,  writeSectorCmd  }  ; 
cfCmd<=(cfState==7’blllllll )  ?  read8  :  cfCmd ; 

//enters  read8  at  end 
bufferAddr<=8’bO; 
busy<=l’bl ; 
numSectorsRead  <=8’b0 ; 
end 

read8 :  begin  //read  alt  status  until  busy  is 

cleared  and  drq  is  set 

//when  reading,  also  wait  for  interrupt  ,  when 
doing  a  write  ,  do  not  wait 
csO  <=rbl ; 
csl  <=l’b0 ; 
addr  <=3’bll0  ; 

read<=(cfState  [6]  =  =  cfState  [5] )  ;  //asserted  for 
01,10  as  high  bits 
write  <=l’bl  ; 
cfDatabusWrite  <=l’b0 ; 
cfDataReg  <=16’b0 ; 

cfCmd<=(cfState==7’blllllll  tempBusy==l’bO 
kk  (intrq==l’bl  ||  isReadMode==rbO) )  ? 
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read9  :  cfCmd ;  //enters  read9  at  end 
tempBusy<=(cfState==7’bl011111 )  ?  (cfData[7]  | 
~cfData[3])  :  tempBusy ; 
wordCount<=8’bO ; 
busy  <=rbl ; 
end 

read9 :  begin  //read  or  write  a  block  of  data 

csO  <=rbO ; 
csl  <=l’bl ; 
addr  <=3’b000  ; 

read<=(isReadMode==l’bO  ||  cfState  [6]==  cfState 
[5]);  //asserted  for  01,10  as  high  bits,  in 
read  mode 

write  <=(isReadMode==l’bl  ||  cfState  [6]  =  == 
cfState  [5]);  //asserted  for  01,10  as  high 
bits,  in  write  mode 
cfDatabusWrite<=~isReadMode ; 
cfDataReg<=(cfSt at e  ==7’b0000100 )  ? 

writeBufferData  :  cfDataReg ; 
numSectorsRead<=(cfState==7’bl  111110 

wordCount==8’bO)  ?  numSectorsRead  +  1 ’bl  : 
numSectorsRead  ; 

wordCount<=(cfState==7’blllll01 )  ?  wordCount 
+  l’bl  :  wordCount; 
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cfCmd<=(cfState==7’blllllll  &&  wordCouiit==8’bO 
)  ?  ( ( numSectorsRead>=sectorCount )  ? 
readStatus  :  readS)  :  cfCmd ;  //enters 
readStatus  or  reads  at  end 
bufferData<=(cfState==7’bl011111 )  ?  cfData  : 
bufferData  ; 

bufferWr <=(isReadMode==l’bl  Mz  cfState==7’ 
bllOOOOO) ; 

bufferAddr  <=(cfState==7’blllllll )  ?  bufferAddr 
+  l’bl  :  bufferAddr; 
busy  <=l’bl ; 
end 

default:  begin  //includes  nop 

csO  <=l’bl ; 
csl  <=l’bl ; 
addr  <=3’b0  ; 
read  <=l’bl  ; 
write  <=l’bl  ; 
cfDatabusWrite  <=l’bO  ; 
cfDataReg  <=16’b0 ; 
debug[3]  <  =  1  ’bl ; 

if  (pendingRead)  begin  //service  read  request 
cfCmd<=readO  ; 
pendingRead  <=l’bO ; 
end 
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busy  <=l’bO ; 


end 

endcase 

end 

end 

end 

endmodule 

//reads  two  bytes  of  data  from  compact  flash  buffer  ,  high  byte 
first 

//input  data  should  be  of  the  form 
//byte  0:  word  number 

module  cfBufferReader  ( elk  ,  reset,  enable,  microData  ,  dataReg  , 
write,  read,  bufferAddr  ,  bufferData); 

input  elk;  //system  elk 

input  reset;  //global  reset 

input  enable;  //I  when  module  is  aetive 

input  write;  //write  signal  from  mieroeontroller  ,  active  low 
input  read;  //read  signal  from  microcontroller  ,  active  low 
input  [15:0]  bufferData;  //buffer  data  out 
input  [7:0]  microData;  //microcontroller  databus 

output [7:0]  dataReg;  //output  register 

output  [7:0]  bufferAddr;  //buffer  read  addr  line 
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reg[7:0]  bufferAddr  ; 

reg  oldWrite;  //write  delayed  by  one  cycle 

reg  oldRead ;  //read  delayed  by  one  cycle 

reg  [7:0]  dataReg ;  //stores  data  to  be  driven  on  bus 

reg  highByte ;  //one  when  high  byte  of  word  is  being  requested 

reg  oldEnable  ;  //enable  delayed  by  one  cycle 

always  @(posedge  elk)  begin 
oldWrite<=write  ; 
oldRead<=read  ; 
oldEnable<=enable  ; 
if (reset )  begin 
highByte  <=l’bl  ; 
dataReg<=8’bO ; 
end 

else  begin 

i  f  (  enable  )  begin 

//if  enable  just  stepped  high,  reset  highByte 
if  ( oldEnable==l’bO)  begin 
highByte  <=l’bl ; 
end 

//if  write  just  stepped  low,  read  a  packet  of  data 
else  if  ((  write==rb0)  Mi  ( oldWrite==l’bl ) )  begin 
buffer  Addr<=inicroData ; 
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end 

//otherwise,  if  read  just  stepped  low,  load  dataReg 
appropriately 

else  i f  ( (  read==l’bO)  ( oldRead==rbl ) )  begin 

dataReg<=highByte  ?  bufferData  [  1 5  : 8]  :  bufferData 

17:0]; 

highByte<=~highByte  ; 
end 
end 
end 
end 

endmodule 

//writes  two  bytes  of  data  to  compact  flash  buffer,  high  byte 
first 

//input  data  should  be  of  the  form 
//byte  0:  word  number 
//byte  1:  high  data  byte 
//byte  2:  low  data  byte 

module  cfBufferWriter  ( elk  ,  reset,  enable,  microData  ,  microWrite  , 
bufferAddr  ,  bufferData  ,  bufferWrite  )  ; 
input  elk  ;  //system  elk 
input  reset;  //global  reset 
input  enable;  //I  when  module  is  active 
input  [7:0]  microData;  //microcontroller  databus 
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input  microWrite;  //write  signal  from  microcontroller,  active 


low 


output  [7:0]  bufferAddr  ;  //buffer  read  addr  line 

output  [15:0]  bufferData;  //buffer  data  out 

output  bufferWrite  ;  //buffer  write  line,  active  high 

reg[7:0]  bufferAddr; 
reg[15:0]  bufferData; 
reg  bufferWrite  ; 

reg  oldWrite ;  //microWrite  delayed  by  one  cycle 
reg  oldEnable ;  //enable  delayed  by  one  cycle 
reg  pendingWrite  ;  //I  when  write  to  buffer  is  pending 
reg  [1:0]  packetCount  ;  //counts  input  packets 

always  @(posedge  elk)  begin 
oldWrite<=microWrite  ; 
oldEnable<=enable  ; 
if(reset)  begin 

pendingWrite  <=rb0 ; 
bufferWrite  <=rb0 ; 
end 

else  begin 

if  (  enable  )  begin 
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//if  enable  just  stepped  high,  reset  packetCount  and 
pendingWrite 

i f  ( oldEnable==l’bO )  begin 
packetCount  <=2’b0 ; 
pendingWrite  <=l’bO ; 
end 

//if  write  just  stepped  low,  read  a  packet  of  data 
else  i  f  ( (  microWrite==l’bO )  &&  ( oldWrite==l’bl ) )  begin 
packetCount  <=(packetCount==2’bll )  ?  2’bll  : 

packetCount +  1 ’bl ; 
case  ( packetCount ) 

2’bOO:  buffer Addr<=microData ; 

2  ’bOl  :  bufferData<={microData  ,  bufferData  [7:0]}; 
2’blO:  bufferData<={bufferData  [  1  5  :  8]  ,  microData  }  ; 
default  :  bufferData<=bufferData  ; 
endcase 

pendingWrite <=(packet Count  ==2’bl0)  ; 
end 

else  begin 

pendingWrite  <=l’b0  ; 
end 

bufferWrite<=pendingWrite  ; 
end 

else  begin 

buffer  Write  <=rb0 ; 
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end 


end 

end 

endmodule 
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module  ether  Controller  ( elk  ,  reset  ,  rxd  ,  txd  ,  raodReset ,  fifoRead  , 
fifoData  ,  fifoEmpty  ,  fifoUsedWords  ,  sendRaw  ,  led  )  ; 
input  elk;  //system  elk,  25MHz 

input  reset;  //global  reset  ,  aetive  high 

input  rxd;  //serial  input  line  from  wifi  module 

input  [7:0]  fifoData;  //databus  from  fifo 
input  fifoEmpty;  //high  when  fifo  is  empty 

input  [9:0]  fifoUsedWords;  //number  of  data  words  in  fifo 


output  txd ; 
output  fifoRead  ; 
output  modReset ; 
output  sendRaw; 

of  prep  data 
output  led  ; 


//serial  output  line  to  wifi  module 
//read  line  for  fifo  ,  aetive  high 
//reset  line  for  wifi  module,  aetive  low 
//used  to  foree  sending  raw  data  instead 


reg  led  ; 
reg  fifoRead  ; 
reg  sendRaw; 

reg[2:0]  state;  //state  for  transmit  fsm 

reg  [31:0]  initCount  ;  //used  to  keep  module  inactive  for 
designated  amount  of  time  at  startup 
reg  [2:0]  cmd;  //used  to  select  current  command 


a 
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reg[7:0]  byteCount  ;  //used  to  count  bytes  in  send/recv 

stream 

reg[19;0]  waitCount ;  //used  to  wait  during  recv 
reg  sendingDataPacket ;  //used  to  keep  track  of  packet  type 
reg  commandReceived ;  //used  to  signify  a  command  has  been 
received  via  wifi 

reg  [7:0]  command;  //stores  command  from  wifi 

reg  justSentStatus ;  //used  to  record  sending  of  status 

updates 

wire  modReset ; 

parameter  initPer  =375000000; 

parameter  maxByteCount=8’h81  ;  //max  value  of  byteCount 
parameter  lastinit Addr  =7’h61  ;  //last  addr  of  init  rom  that 
stores  init  params 

parameter  readStart  =7’h62 ;  //first  addr  of  read  command 
parameter  readEnd=7’h6C ;  //last  addr  of  read  command 
parameter  preambleSt  art  =7’h6D ;  //first  addr  of  preamble 
parameter  preambleEnd  =  7’h7C ;  //last  addr  of  preamble 
parameter  etherBaud  =  38400; 


//transmit  state  enum 
parameter  waiting=3’b000 ; 
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parameter 

parameter 

parameter 

parameter 


readRq=3’b001  ; 
moveData=3’b010 ; 
sendData=3’b011 ; 
doneData=3’bl00  ; 


/ / receive 
parameter 
parameter 
parameter 


state  enum 
rWaiting=3’bl00  ; 
rMoveData=3’bl01  ; 
rWriteData  =  3’bllO  ; 


/  /  command 
parameter 
parameter 
parameter 
parameter 
parameter 


enum 

dolnit  =3’b000  ; 
doReadCmd=3’b001  ; 
doReadData=3’b010  ; 
doPreamble=3’b011  ; 
doSendData=3’bl00 ; 


//header  enum 

parameter  header_null  =8’ hff  ; 
parameter  header_data=8’h00  ; 
parameter  header_status=8’h01 ; 
parameter  header_debug=8’h02  ; 


//receive  command  enum 
parameter  command_start=8’h43  ; 
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parameter  command_transmit_no=8’h30 ; 
parameter  command_transmit_yes=8’h31  ; 
parameter  command_data_prep=8’h32  ; 
parameter  command_data_raw=8’h33  ; 
parameter  command_null=8’ hf ; 

// async_receiver  module  and  connections 
wire  rxDataReady  ,  rxDebug  ,  rxidle  ; 
wire  [7:0]  rxData ; 

async_receiver  aAsync_receiver(clk  ,  rxd  ,  reset  ,  rxDataReady  , 
rxData  ,  rxidle  ,  rxDebug)  ; 
defparam  aAsync_receiver  .  Baud=etherBaud  ; 

// async.transmitter  module  and  connections 
reg  txStart ; 
wire  txBusy ; 
reg  [7:0]  txData ; 

async-transmitter  aAsync.transmitter  (  elk  ,  txStart  ,  txData  , 
,  txBusy)  ; 

defparam  aAsync_transmitter  .  Baud=etherBaud  ; 

//etherlnit  module  and  connections 
reg  [6:0]  etberlAddr  ; 
wire  [7:0]  etherlData; 

etherlnit  aEtherInit(  etherlAddr  ,  ~  elk  ,  etherlData )  ; 


txd 
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//etherRecvFIFO  module  and  connections 
reg  [7:0]  rfifoln  ; 

reg  rfifoWrite  ,rfifoRead  ,  rfifoClr  ; 
wire  [7:0]  rfifoOut  ; 
wire  rfifoEmpty  ; 

etherRecvFIFO  aEtherRecvFIFO  (  rfifoln  ,  rfifoWrite  ,rfifoRead  ,  elk  , 
rfifoClr  ,  rfifoOut  ,  rfifoEmpty  )  ; 
assign  modReset=~reset  ; 

always  @(posedge  elk)  begin 

initCount  <=(initCount<initPer )  ?  initCount+l’bl  :  initCount  ; 
if  (reset  ||  initCount<initPer )  begin 
state<=waiting  ; 
fifoRead  <=l’b0 ; 
rfifoRead  <=l’b0 ; 
rfifoWrite  <=l’b0; 
rfifoClr  <=l’bl ; 
rfifoln  <=8’h55 ; 
txStart  <=l’b0 ; 
etherlAddr  <=7’b0 ; 
cmd<=doInit  ; 
byteCount  <=8’b0 ; 
led  <=l’b0 ; 

sendingDataPacket  <=rb0 ; 
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commandReceived<=l’bO ; 
command<=command_null ; 
sendRaw<=l’bO ; 
justSentStatus  <=l’bl  ; 
end 

else  begin 

led  <=(cmd=doPreamble)  ; 
case  (cmd) 

dolnit  :  begin  //initialize 
fifoRead  <=rbO ; 
rfifoRead  <=l’bO ; 
rfifoWrite  <=rbO; 
rfifoClr  <=rbO; 

txData<=(state=moveData)  ?  etherlData  :  txData; 
txStart  <=(state=sendData)  ; 

etherlAddr <=(state=waiting  etherIAddr> 

lastinit  Addr )  ?  readStart  :  ( (  state=nioveData) 

?  etherIAddr  +  1 ’bl  :  etherlAddr); 
cmd<=(st  ate=waitiiig  &&  etherIAddr>lastInit  Addr  )  ? 

doReadCmd  :  cmd ; 
case (state ) 

waiting:  state  <=(txBusy==l’bO)  ?  moveData  : 

waiting  ; 

moveData:  state <=sendData ; 
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sendData:  state  <=(txBusy==rbl )  ?  waiting  : 

sendData  ; 

default:  state<— waiting  ; 

endcase 

sendingDataPacket  <=rbl ; 
end 

doReadCmd:  begin  //send  read  command 
fifoRead  <=:l’bO ; 
rfifoRead  <=l’bO; 
rfifoWrite  <=l’bO; 
rfifoClr  <=rbl  ; 

txData<=(state==moveData)  ?  etherlData  :  txData; 
txStart  <=(state=sendData)  ; 

etherlAddr <=(state=moveData)  ?  etherlAddr +1 ’bl 
etherlAddr  ; 

cmd<=(state=waiting  &&  etherlAddr >readEnd )  ? 

doReadData  :  cmd; 
byteCount  <=8’b0 ; 
waitCount  <=20’b0 ; 
sendingDataPacket  <=l’bl ; 
case (state ) 

waiting:  state  <=(txBusy==l’bO  &&  etherIAddr<= 

readEnd)  ?  moveData  :  waiting; 
moveData:  state<=sendData ; 
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sendData:  state  <=(txBusy==l’bl )  ?  waiting  : 

sendData  ; 

default:  state<=waiting  ; 

endcase 

commandReceived<=l’bO  ; 
command<=command_null ; 
end 

doReadData:  begin  //read  data 
fifoRead  <=l’bO ; 
rfifoRead  <=l’bO; 
rfifoClr  <=rbO; 
etherIAddr<=preambleStart  ; 

waitCount<=(waitCount==20’ hfffff  ?  waitCount  : 
waitCount  +  1 ’bl )  ; 

cmd<=((waitCount  >20’ hfOOOO  ||  byteCount==8’ hff ) 
state=rWaiting)  ?  doPreamble  :  doReadData 

sendingDataPacket  <=(st  ate=rWriteData  (( 

byteCount==8’h03  &&  r  fifoln  !  =  8 ’h30  &&  rfifoln 
!  =  8’h4f  &&  justSentStatus==l’bO)  ||  ( 

byteCount==8’h04  &&  rfifoln  !  =  8 ’hOD  &&  rfifoln 
!  =  8’h4b  justSentStatus==l’bO) )  ?  I’bO  : 
sendingDataPacket);  //checks  if  the  third 
received  character  is  ’O’,  and  fourth  is  <CR 
>,  if  so,  send  a  data  packet;  todo :  do  this 
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better 


commandReceived<=(state=rWriteData  &&  (( 

byteCo'unt==8’h05  ||  byteCount==8’h06 ) ) )  ?  (( 
byteCount==8’h05 )  ?  (  r  fifoIn=cornmand_start ) 

:  ( (  byteCount==8’h06  &&  rfifoIn!  = 
command_start )  ?  I’bO  :  commandReceived ) )  : 
commandReceived ; 

cominand<=(state=rWriteData  &&  commandReceived 
==rbl  &&  ( ( byteCount==8’h07  ||  byteCount===8 
h08)))  ?  ( ( byteCount==8’h07)  ?  rfifoln  :  (( 
byteCount==8’h08  &&  rfifoln  !=command)  ? 
command_null  :  command))  :  command; 
rfifoln  <=(state=rMoveData)  ?  rxData  :  rfifoln  ; 
rfifoWrite  <=(state=rWriteData)  ; 
byteCount<=(state=rMoveData)  ?  byteCount  +  1 ’bl 
byteCount  ; 
case (state  ) 

rWaiting:  state  <=(rxDataReady==l’bl )  ? 

rMoveData  :  rWaiting; 
rMoveData:  state<=rWriteData ; 

rWriteData:  state<=rWaiting  ; 
default:  state<=rWaiting ; 

endcase 
end 

doPreamble:  begin  //send  preamble 
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fifoRead  <=l’bO ; 
rfifoRead  <=l’bO; 
rfifoWrite  <=l’bO; 
byteCount  <=8’b0 ; 
waitCount  <=20’b0 ; 

if  ( (  fifoUsedWords>maxByteCount)  ||  ( 

sendingDataPacket==rbO) )  begin 
txData<=(state=moveData)  ?  etherlData  : 
txData ; 

txStart  <=(state=sendData)  ; 

etherlAddr <=(state=moveData)  ?  etherlAddr +  1  ’ 
bl  :  etherlAddr  ; 

cmd<=(state=waiting  etherlAddr >preambleEnd 
)  ?  doSendData  :  cmd ; 
case (state ) 

waiting:  state  <=(txBusy==rbO  &&  etherlAddr 

<=preambleEnd )  ?  moveData  :  waiting; 
moveData :  state<=sendData ; 

sendData  :  state  <=(txBusy==l’bl )  ?  waiting 

:  sendData ; 

default:  state  <=waiting  ; 

endcase 
end 

case  (command) 

command_data_prep :  sendRaw<=l’bO ; 
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command-data_raw :  sendRaw<=l’bl  ; 
default:  sendRaw<=sendRaw ;  //nop 
endcase 
end 

doSendData:  begin  //transmit  data 

etherIAddr<=preambleStart  ; / /  readStart  ; 
rfifoWrite  <=rbO ; 

if  (sendingDataPacket==l’bl)  begin  //sending 
regular  data  packet 

fifoRead  <=(state=readRq  &&  (byteCount< 
maxByteCount  — I’bl ) )  ;  //only  read  during 
readRq 

rfifoRead  <=l’bO ; 

txData<=(state=moveData)  ?  ((byteCount= 
maxByteCount)  ?  8’hOD  :  ( ( byteCount^= 

maxByteCount  — I’bl)  ?  header-data  :  fifoData 
.  ) )  :  txData  ; 
justSentStatus<=l’bO; 
end 

else  begin  //sending  other  packet 
fifoRead  <=rbO ; 

rfifoRead  <=((state=readRq)  (byteCount< 
maxByteCount  — I’bl)  &&  ( rfifoEmpty==rbO) )  ; 
//only  read  during  readRq 
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txData<=(state=moveData)  ?  ((byteCount= 
maxByteCount)  ?  8’hOD  :  ((byteCount= 
maxByteCount  — I’bl )  ?  header.debug  :  (( 

rfifoEmpty==l’bl)  ?  8’hO  :  rfifoOut))  )  : 
txData ; 

justSentStatus  <=l’bl ; 

end 

txStart  <=(state=sendData)  ; 

waitCount<=(waitCount==20’ hf ff ff  ?  waitCount  : 
waitCount  +  1 ’bl )  ; 

byteCount<=(state=moveData)  ?  byteCount  +  1 ’bl  : 
byteCount ; 

//cmd<=(state=waiting  &&  byteCount>maxByteCount 
)  ?  doReadCmd  :  cmd ; 

cmd<=(st  at  e=doneData  &&  waitCount  >20  ’  hfOOOO  )  ? 
doReadCmd  :  cmd ; 

case ( state ) 

waiting:  state  <=(byteCount>maxByteCount)  ? 

doneData  :  ( ( (  fifoEmpty  ==l’b0  || 

sendingDataPacket==l’bO)  txBusy==l’bO)  ? 
readRq  :  waiting); 
readRq :  state<=moveData ; 

moveData:  state<=sendData ; 
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sendData : 


state  <=(txBusy==l’bl )  ?  waiting 


sendData ; 

doneData:  state<=doneData ; 
default:  state<=waiting ; 

endcase 
end 


endcase 

end 

end 

endmodule 
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